Generation of diversity in combinatorial libraries

Biotech evolutionary methods, including combinatorial libraries
and phage-display technology (PARMLEY & SMITH 1988; SCOTT &
SMITH 1990; SMITH 1993), are used in the search for novel
ligands of diagnostic, biomedical and pharmaceutical use
(reviews: CORTESE 1996; COLLINS 1997). These methods, which use
empirical procedures to select molecules with required
characteristics, e.g. binding properties, from large
populations of variant gene products, have been compared to the
process of natural evolution. Evolution includes the generation
of mutation, selection of functionality over a time period and
the ability of the systems to self-replicate. In particular,
natural systems use recombination to reassort mutations
accumulated in the selected population to exponentially
increase the combinations of mutations and thus increase the
number of variants in the population. This latter aspect,
namely the introduction of recombination within mutant genes,
has only recently been applied to biotech evolutionary methods,
although it has been used to increase the size of initial
phage-display libraries (e.g. WATERHOUSE 1993; TSURUSHITA 1996;
SODOYER 1994; FISCH 1996). STEMMER 1994a, 1994b and 1995 teach
that recombination amongst a population of DNA molecules can be
achieved in vitro by PCR amplification of a mixture of small
overlapping fragments with (1994a, 1994b) or without (STEMMER
1995) primer oligonucleotide sequences being used to drive the
PCR reaction. The method is not applicable to recombination
within a fully randomized (highly mutated) sequence since the
method relies on high homology of the overlapping sequences at
the site of recombination. STEMMER 1994b and CRAMERI 1996a do,
however, demonstrate the usefulness of in vitro recombination
for molecular evolution, where CRAMERI 1996b also demonstrate
the use of the method in conjunction with phage-display, even
though their method is confined to regions of low mutant
density (ca. 0.5-1% of the bases are mutated in their method).
As they state, "the advantages of recombination over existing
mutagenesis methods are likely to increase with the numbers of
cycles of molecular evolution" (STEMMER 1994b). We point out
that this is due to the self-evident fact that the number of
variants created by mutagenesis introducing base changes in
existing mutant structures is an additive, i.e. a linearly
increasing, function, whereas the use of recombination between
mutated variants yields novel variants as an exponential
function of the initial number of variants. The classical
phage-display libraries are thus at a grave disadvantage for
the generation of novel variants; e.g. to encompass all the
possible variants of an octapeptide sequence,
$20^8 = 2.56 \times 10^{10}$ different variants would be required.

WO 98/33901 PCT/EP98/00533

MARKS 1992 state the importance of recombination in the
generation of higher specificity in combinatorial libraries,
e.g. in attaining antibodies of higher specificity and binding
constants, in the form of reshuffling light and heavy chains of
immunoglobulins displayed in phage-display libraries. These
authors do not instruct how the shuffling of all the light and
heavy chains in a population heterogeneous in both chains can
be achieved, e.g. by a vector allowing recombination. Heavy and
light chains were selected one after the other, i.e. an optimal
heavy chain was first selected from a heterogeneous heavy chain
population in the presence of a constant light chain; then, by
preparing a new library, an optimal light chain was selected in
combination with the preselected optimal heavy chain. The
extensive, time-consuming sequential optimization strategies
currently utilized, including consensus-mutational libraries,
in vivo mutagenesis, error-prone PCR as well as chain shuffling,
are summarized in Figures 5 and 6 of COLLINS 1997.

General background to phage and phage-display libraries 

Gene libraries are generated containing extremely large numbers
(10^ to 10"^^) of variants. The variant gene segments are fused 
to a coat protein gene of a filamentous bacteriophage (e.g.
M13, fd or f1), and the fusion gene is inserted into the genome
of the phage or of a phagemid. A phagemid is defined as a
plasmid containing the packaging and replication origin of the
filamentous bacteriophage. This latter property allows the
packaging of the phagemid genome into a phage coat when it is
present in an Escherichia coli host strain infected with a
filamentous phage (superinfection). The packaged particles
produced, be they phage or phagemid, display the fusion protein
on the surface of the particles secreted into the medium. Such
packaged particles are able to inject their genomes into a new
host bacterium, where they can be propagated as phage or
plasmids, respectively. The special property of the system lies
in the fact that, since the packaging takes place in
individual cells usually infected by a single variant
phage/phagemid, the particles produced on propagation contain
the gene encoding the particular variant displayed on the
particle's surface. Several cycles of affinity selection for
clones exhibiting the required properties due to the particular
property of the variant protein displayed, e.g. binding to a
particular target molecule immobilized on a surface, followed
by amplification of the enriched clones, lead to the isolation
of a small number of different clones having these properties.
The primary structure of these variants can then be rapidly 
elucidated by sequencing the hypermutated segment of the 
variant gene. 

Efficiency of producing combinatorial libraries 

There are a number of factors which limit the potential of this
technology. The first is the number and diversity of the
variants which can be generated in the primary library. Most
libraries have been generated by transformation of ligated DNA
preparations into Escherichia coli by electroporation. This
gives an efficiency of ca. 0.1 to 1 x 10^ recombinants
per microgram of ligated phage DNA. The highest cloning efficiency
reported (of 10*' recombinants per microgram insert DNA) is
obtained using special lambda vectors into which a single
filamentous phage vector is inserted, in a special cloning
site, bracketed by a duplication of the filamentous phage
replication/packaging origin (AMBERG 1993; HOGREFE 1993a+b).
The DNA construct is efficiently introduced into the
Escherichia coli host after packaging into a lambda
bacteriophage coat in an in vitro lambda packaging mix.
Infection of a strain carrying such a hybrid phagemid by an
M13 helper phage allows excision and secretion of the insert
packed in a filamentous phage coat. Neither AMBERG 1993 nor
HOGREFE 1993a+b instruct on how the method may be used to
introduce recombination during this procedure. Although they
mention that the efficiency may be improved by the use of type
IIS restriction endonucleases during the construction of the
concatemers used as substrate for the in vitro packaging, no
examples are given and in the ensuing five years no examples
have appeared in the literature. The procedure described in our
invention also uses the high efficiency of the in vitro lambda
packaging, but maximizes the capacity of the cloning vector by
using a cosmid vector (8) in which many copies (say 8) of the
phagemid are inserted in each construct. One of the surprising
innovative aspects of this procedure is the discovery of a
number of protocols for the de novo synthesis of large
hypervariable libraries. One type is particularly efficient,
in that phagemid/cosmid vectors are forced to integrate into
the hybrid concatemers in the same orientation. Any
variant of the protocol which does not ensure this feature does
not work efficiently.

The use of type IIS restriction endonucleases

SZYBALSKI 1991 teaches a large number of novel applications for
type IIS restriction endonucleases, including precise trimming
of DNA, retrieval of cloned DNA, gene assembly, use as a
universal restriction enzyme, cleavage of single-stranded DNA,
detection of point mutations, tandem amplification, printing
amplification reactions and localisation of methylated bases.
They do not give any instruction as to how such enzymes can be
used in the creation of recombination within highly mutated
regions, e.g. within a combinatorial library.

According to a first embodiment the invention concerns a bank
of genes, wherein said genes comprise a double stranded
DNA sequence which is represented by the following formula
of one of their strands:

$5'\,B_1 B_2 B_3 \ldots B_n X_{n+1} \ldots X_{n+a} Z_{n+a+1} Z_{n+a+2} X_{n+a+3} \ldots X_{n+a+b} Q_{n+a+b+1} \ldots Q_{n+a+b+j}\,3'$

wherein n, a, b and j are integers and
n>3, a>1, b>3 and j>1,

wherein $X_{n+1} \ldots X_{n+a+b}$ is a hypervariable sequence and B,
X, Z and Q represent adenine (A), cytosine (C), guanine
(G) or thymine (T),

(i) Z represents G or T at a G:T ratio of about 1:1,
and/or

(ii) Z represents C or T at a C:T ratio of about 1:1,
and/or

(iii) Z represents A or G at a A:G ratio of about 1:1,
and/or

(iv) Z represents A or C at a A:C ratio of about 1:1, and
wherein

subsequences $B_1 \ldots B_n$ and/or $Q_{n+a+b+1} \ldots Q_{n+a+b+j}$ represent
recognition sites for restriction enzymes, and wherein the
recognition sites are orientated such that their cleavage
site upon cleavage generates a cohesive end including the
two bases designated Z.

Restriction of this sequence with a type IIS restriction enzyme
as thus described, followed by religation, leads to the
recombination of the hypervariable regions located 5' and 3' of
the cleavage site. This is the essence of the methodology which
we designate "cosmix-plexing". It is essential in this
procedure that the fragments generated on cleavage by the
restriction enzyme are religated in the correct
orientation ("head-to-tail"), whereby the Z sequences are chosen
for the four libraries ((i) to (iv)) so as to ensure this (see
below) yet still allow all possible amino-acids to be
encoded at the cleavage site. If this correct orientation is
not ensured there will be a drastic reduction in the
percentage of correctly reconstituted fusion-protein genes, a
reduction in the proportion of molecules which can be packaged
in vitro in the lambda-packaging extracts (which requires the
correct orientation of the cos-sites), as well as a reduction
in the proportion of in vivo excisable phagemid copies from the
cosmid concatemer (excision requires the correct orientation
of consecutive phage replication origins).

[Diagram: correct orientation versus incorrect orientation (head-to-head ligation) of fragments carrying 2 bp cohesive ends such as XGG/Y and Y/CC.]

To prevent the problems arising from false orientation
(head-to-head) mentioned in the previous paragraph, the four gene
libraries mentioned in claim must be kept separated during
cosmix-plexing. In fact, with respect to the formation of
recombinants the libraries behave as 16 separate sets which
cannot recombine with each other: four libraries maintained
separately, where each set contains four possible cohesive
ends, e.g. library (i) with Z = G or T contains:

5'  >XGT/Y---->,  >XGG/Y---->,  >XTG/Y---->  and  >XTT/Y---->

3'  Y/CA X<----,  Y/CC X<----,  Y/AC X<----  and  Y/AA X<----

It is evident that problems of false orientation will arise on
mixing the different libraries, e.g. the "AC" library (iv) will
contain AA, AC, CA and CC sequences, each of which can pair in
the false orientation with one of the cohesive ends generated
in library (i).
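
The separation argument above can be checked with a short sketch. It assumes that a false (head-to-head) ligation is possible whenever one 2-base overhang is the reverse complement of another; the function and variable names are illustrative and not part of the described method.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

# Bases allowed at the two Z positions in libraries (i) to (iv), as in the text.
LIBRARIES = {"i": "GT", "ii": "CT", "iii": "AG", "iv": "AC"}

def cohesive_ends(alphabet: str) -> set[str]:
    """All 2-base overhangs ZZ that a library can generate."""
    return {a + b for a, b in product(alphabet, repeat=2)}

def false_pairs(ends_a: set[str], ends_b: set[str]) -> set[tuple[str, str]]:
    """Overhang pairs (one from each pool) that could anneal head-to-head."""
    return {(x, y) for x in ends_a for y in ends_b if reverse_complement(x) == y}

ends = {name: cohesive_ends(z) for name, z in LIBRARIES.items()}

# Within one library no overhang is the reverse complement of another.
within = {name: false_pairs(e, e) for name, e in ends.items()}

# Mixing libraries (i) and (iv) makes every overhang of (i) vulnerable.
mixed_i_iv = false_pairs(ends["i"], ends["iv"])
```

Under this assumption each library kept on its own is free of false pairings, while mixing (i) with (iv) yields four such pairs (GT/AC, GG/CC, TG/CA, TT/AA).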



A specific embodiment of the invention concerns a bank of genes
wherein subsequences $B_1 \ldots B_n$ or $Q_{n+a+b+1} \ldots Q_{n+a+b+j}$
represent recognition sites for restriction enzymes and
wherein the recognition sites are orientated such that
their cleavage site upon cleavage generates a cohesive end
including the two bases designated Z.

Further, a specific embodiment concerns a bank of genes,
wherein the cohesive end is a 2 bp single strand end
formed by the two bases designated Z.

Further, a specific embodiment concerns a bank of genes wherein 
each gene is provided as display vector, especially as M13 
phage or M13 like phage or as phagemid. 

Another embodiment of the invention concerns a set of four gene
banks according to the invention wherein the gene banks
are characterized as follows:

- first gene bank: Z represents G or T, preferentially at a
G:T ratio of about 1:1;

- second gene bank: Z represents C or T, preferentially at a
C:T ratio of about 1:1;

- third gene bank: Z represents A or G, preferentially at a
A:G ratio of about 1:1; and

- fourth gene bank: Z represents A or C, preferentially at a
A:C ratio of about 1:1.

A specific embodiment of the invention concerns a set of four 
gene banks wherein each gene is provided as display 
vector, especially as M13 phage or M13 like phage or as 
phagemid. 

Another embodiment of the invention concerns a bank of genes
wherein said genes comprise a double stranded DNA sequence
which is represented by the following formula of one of
their strands:

$5'\,B_1 B_2 B_3 \ldots B_n X_{n+1} \ldots X_{n+a} Z_{n+a+1} Z_{n+a+2} X_{n+a+3} \ldots X_{n+a+b} Q_{n+a+b+1} \ldots Q_{n+a+b+j}\,3'$

wherein n, a, b and j are integers and
n>3, a>1, b>3 and j>1,

wherein $X_{n+1} \ldots X_{n+a+b}$ is a hypervariable sequence and B,
X, Z and Q represent adenine (A), cytosine (C), guanine
(G) or thymine (T), and wherein

four sets of oligonucleotide sequences comprising $Z_{n+a+1}$
and $Z_{n+a+2}$ are present, preferentially at a ratio of
(i):(ii):(iii):(iv) of about 1:1:2:2, wherein the four
sets are characterized as follows:

first set: $Z_{n+a+1}$ represents G and $Z_{n+a+2}$ represents G;

second set: $Z_{n+a+1}$ represents C and $Z_{n+a+2}$ represents T;

third set: $Z_{n+a+1}$ represents A and $Z_{n+a+2}$ represents A or
C, preferentially at a A:C ratio of about 1:1; and

fourth set: $Z_{n+a+1}$ represents T and $Z_{n+a+2}$ represents C or
G, preferentially at a C:G ratio of about 1:1, and wherein
sequences $B_1 \ldots B_n$ and/or $Q_{n+a+b+1} \ldots Q_{n+a+b+j}$ represent
recognition sites for restriction enzymes, wherein the
recognition sites are orientated such that their cleavage
site upon cleavage generates a cohesive end including the
two bases designated Z.

A specific embodiment of the invention concerns a bank of genes
wherein the four sets of oligonucleotide sequences are
present at a ratio of (i):(ii):(iii):(iv) of (0 to 1):(0
to 1):(0 to 1):(0 to 1) with the proviso that at least one
of said sets is present.

Further, a specific embodiment of the invention concerns a bank
of genes wherein subsequences $B_1 \ldots B_n$ and/or
$Q_{n+a+b+1} \ldots Q_{n+a+b+j}$ represent recognition sites for
restriction enzymes and wherein the recognition sites are
orientated such that their cleavage site upon cleavage
generates a cohesive end including the two bases
designated Z.

Further, a specific embodiment of the invention concerns a bank
of genes wherein the cohesive end is a 2 bp single strand
end formed by the two bases designated Z.

Another embodiment of the invention concerns a bank of genes
wherein said genes comprise a double stranded DNA sequence
which is represented by the following formula of one of
their strands:

$5'\,B_1 B_2 B_3 \ldots B_n X_{n+1} \ldots X_{n+a} Z_{n+a+1} Z_{n+a+2} X_{n+a+3} \ldots X_{n+a+b} Q_{n+a+b+1} \ldots Q_{n+a+b+j}\,3'$

wherein n, a, b and j are integers and
n>3, a>1, b>3 and j>1,

wherein $X_{n+1} \ldots X_{n+a+b}$ is a hypervariable sequence and B,
X, Z and Q represent adenine (A), cytosine (C), guanine
(G) or thymine (T), and wherein

the following six sets of oligonucleotide sequences
comprising $X_{n+a}$, $Z_{n+a+1}$ and $Z_{n+a+2}$ are present, preferably
at a ratio of (i):(ii):(iii):(iv):(v):(vi) of about
3:4:3:4:4:1, wherein the six sets are characterized as
follows:

first set: $X_{n+a}$ represents A, G and/or T, preferentially
at a ratio of about 1:1:1, or $X_{n+a}$ represents C, G and/or
T, preferentially at a ratio of about 1:1:1; $Z_{n+a+1}$
represents G and $Z_{n+a+2}$ represents G;

second set: $X_{n+a}$ represents A, C, G and/or T,
preferentially at a ratio of about 1:1:1:1, $Z_{n+a+1}$
represents C and $Z_{n+a+2}$ represents T;

third set: $X_{n+a}$ represents A, C and/or G, preferentially
at a ratio of about 1:1:1, $Z_{n+a+1}$ represents A and $Z_{n+a+2}$
represents A;

fourth set: $X_{n+a}$ represents A, C, G and/or T,
preferentially at a ratio of about 1:1:1:1, $Z_{n+a+1}$
represents A and $Z_{n+a+2}$ represents C;

fifth set: $X_{n+a}$ represents A, C, G and/or T,
preferentially at a ratio of about 1:1:1:1, $Z_{n+a+1}$
represents T and $Z_{n+a+2}$ represents C;

sixth set: $X_{n+a}$ represents A, $Z_{n+a+1}$ represents T and
$Z_{n+a+2}$ represents G.



"Single-tube" method

Problem

A method should be developed which allows cosmix-plexing without maintaining separate
libraries. This would have the advantage of reducing the manipulation involved in screening the
four separate libraries, as previously described, and would offer a saving in both time and
materials. This has been achieved in two separate versions of the invention.

Solution

It is possible to select combinations of nucleotides within the cohesive ends generated by type
IIS restriction within the aforementioned sequence, i.e. ZZ, such that all the clones are present
in a single library and the possibility of false orientation during ligation, together with the
associated loss of efficiency, is eliminated. At the same time the number
of subsets, defined by the number of different cohesive ends which can be generated, which
cannot interact (recombine) with each other, is reduced from the 16 sets of the previously
described version of the method to 6.

Designing the sequences 

The combinations of 2 bp single-strand cohesive end sequences which can be generated at ZZ
are theoretically as follows:

AA   CA   GA   TA
AC   CC   GC   TC
AG   CG   GG   TG
AT   CT   GT   TT


Of these, the sequences with an inverted symmetry axis (palindromes: AT, TA, GC, CG) can
pair in both orientations and are thus to be eliminated from cosmix-plexing libraries for the
reasons given above. The remaining 12 sequences are actually 6 sets of complementary pairs
(e.g. CC+GG, AA+TT, CA+TG). By choosing one partner from each pair (total of 6) a single
set of cohesive ends can be generated which can pair only in the correct "head-to-tail"
orientation. The actual choice of sequences takes the codon usage into account, assuming that
ZZ are chosen as the 2nd and 3rd position of the codon. The determining factors are the amino-acids
which are encoded by either a single or only two codons: the single-codon amino-acids methionine (TG) and
tryptophan (GG) and, after elimination of the palindromic sequences, the amino-acids for which only a
single codon ending remains available, namely aspartic acid (Asp), asparagine (Asn), cysteine (Cys), histidine (His) and
tyrosine (Tyr). To encode Asp, Asn, His and Tyr an AC sequence is required. Selecting AC
has the consequence that the complementary sequence GT must be avoided, although GT is the only
possibility of encoding Cys. However, the inclusion of Cys within the hypervariable sequence
often causes problems of misfolding and the formation of dimeric aggregates, dependent on
the redox potential of the environment. It was thus decided to create a set in which Cys
codons are eliminated, but which will be of great use in many applications, including cyclic
peptide library formation. If the sequence AA is chosen to encode glutamic acid (Glu),
glutamine (Gln) and lysine (Lys), also allowing the stop-codon TAA, then TT must be
eliminated. The consequence of this is that TC must also be included so that phenylalanine
(Phe) and isoleucine (Ile) can be encoded. The elimination of the complementary GA is
without consequence since GG-containing codons encode arginine (Arg) and glycine (Gly). The
elimination of CC is then without consequence, since alanine (Ala), proline (Pro), serine (Ser)
and threonine (Thr) can be encoded by CT-containing codons. This is the argumentation for
the selection of ZZ sequences designated "combination A" below.
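
The selection rule can be written down as a small check. This is a sketch under the assumption that a set of overhangs is orientation-safe when it contains no palindrome and no two members that are reverse complements of each other; the set named `COMBINATION_A` is read off the ZZ endings of the four oligonucleotides NGG, NCT, NA(A or C) and NT(C or G) listed later in the text, and all identifiers are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
BASES = "ACGT"

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

# All 16 possible 2-base overhangs.
ALL_ENDS = [a + b for a in BASES for b in BASES]

# Overhangs equal to their own reverse complement can pair in both orientations.
PALINDROMES = {e for e in ALL_ENDS if reverse_complement(e) == e}

# The remaining 12 overhangs fall into 6 complementary pairs.
PAIRS = {frozenset((e, reverse_complement(e)))
         for e in ALL_ENDS if e not in PALINDROMES}

# ZZ endings of the four oligonucleotide sets given for combination A.
COMBINATION_A = {"AA", "AC", "CT", "GG", "TC", "TG"}

def is_orientation_safe(ends: set[str]) -> bool:
    """True if no member can anneal head-to-head with any member of the set."""
    return all(reverse_complement(e) not in ends for e in ends)

# Exactly one partner is taken from each complementary pair.
partners_taken = [len(pair & COMBINATION_A) for pair in PAIRS]
```

Adding GT (the Cys ending) to the set breaks the rule because its partner AC is already present, which is the trade-off described in the paragraph above.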

For the sake of completeness: if the doublet AA were left out and, consequently, TT included,
then AG must be included to encode Glu, Gln and Lys. In order to encode Ala and Pro, either
CT (combination B) or CA (combination C) must now be included. This leads to the inclusion
of either AG and CT (combination B), or CA and TG (combination C), as complementary pairs.
Combinations B and C thus do not represent an adequate solution to the problem.



combination A     combination B     combination C

AA  TT            AA  TT            AA  TT
AC  GT            AC  GT            AC  GT
AG  CT            AG  CT            AG  CT
CA  TG            CA  TG            CA  TG
CC  GG            CC  GG            CC  GG
GA  TC            GA  TC            GA  TC


Sequences chosen are shown in bold type. Complementary pairs are adjacent to each other. 

Table 1: Genetic code; the selection of XZZ codons used according to combination A is
marked with an asterisk.

Ala    GCA   GCC   GCG   GCT*
Arg    AGA   AGG*  CGA   CGC   CGG*  CGT
Asp    GAC*  GAT
Asn    AAC*  AAT
Cys    TGC   TGT
Glu    GAA*  GAG
Gln    CAA*  CAG
Gly    GGA   GGC   GGG*  GGT
His    CAC*  CAT
Ile    ATA   ATC*  ATT
Leu    TTA   TTG*  CTA   CTC*  CTG*  CTT
Lys    AAA*  AAG
Met    ATG*
Phe    TTC*  TTT
Pro    CCA   CCC   CCG   CCT*
Ser    AGC   AGT   TCA   TCC   TCG   TCT*
Thr    ACA   ACC   ACG   ACT*
Trp    TGG*
Tyr    TAC*  TAT
Val    GTA   GTC*  GTG*  GTT
Stop   TAA*  TAG   TGA



Table 2. Frequency of the amino-acids, comparing the selected combination A (above) and the
natural frequency of all codons.

Amino-acid   natural frequency   combination A

Ala          4                   1
Arg          6                   2
Asp          2                   1
Asn          2                   1
Cys          2                   0
Glu          2                   1
Gln          2                   1
Gly          4                   1
His          2                   1
Ile          3                   1
Leu          6                   3
Lys          2                   1
Met          1                   1
Phe          2                   1
Pro          4                   1
Ser          6                   1
Thr          4                   1
Trp          1                   1
Tyr          2                   1
Val          4                   2
Stop         3                   1

Total (21 entries)   64          24
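
The codon counts of Table 2 can be recomputed from the standard genetic code. The sketch below assumes the standard code (written in the usual TCAG ordering) and takes combination A to be the ZZ endings AA, AC, CT, GG, TC and TG; one-letter amino-acid symbols are used, with `*` for stop. Names are illustrative.

```python
from collections import Counter
from itertools import product

# Standard genetic code in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# 2nd and 3rd codon positions (ZZ) permitted by combination A.
COMBINATION_A = {"AA", "AC", "CT", "GG", "TC", "TG"}

# Number of codons per amino acid in the full code ("natural frequency").
natural = Counter(GENETIC_CODE.values())

# Number of codons per amino acid that end in a combination A doublet.
selected = Counter(aa for codon, aa in GENETIC_CODE.items()
                   if codon[1:] in COMBINATION_A)

for aa in sorted(natural):
    print(aa, natural[aa], selected[aa])
```

With these assumptions 24 of the 64 codons remain, every amino acid except Cys keeps at least one codon, and a single stop codon (TAA) is retained.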



Creation of a set of four oligonucleotides according to combination A

Gene libraries can be created according to the requirements of combination A, by creating
four sets of nucleotides in which $X_{n+a}Z_{n+a+1}Z_{n+a+2}$ are:

i) NGG

ii) NCT

iii) NA(A or C)

iv) NT(C or G),

where N is C, G, A or T.

After the synthesis of these oligonucleotides they can be combined to obtain a single-tube
cosmix-plexing gene library, whereby to obtain the relative codon frequencies given in Table
2 the gene libraries i) to iv) are present in the final mixture at a ratio of 1:1:2:2, respectively.
As explained above this mixture will always give a correct orientation on religation of type IIS
restriction enzyme-cleaved fragments having the 2 bp single-stranded cohesive ends ZZ.
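
Why a 1:1:2:2 ratio equalises the codons can be seen by expanding the degenerate triplets. The sketch assumes each degenerate position is synthesised equimolar and writes the four sets in IUPAC notation (M = A or C, S = C or G), which the text itself does not use; function names are illustrative.

```python
from fractions import Fraction
from itertools import product

# IUPAC degenerate base codes needed for the four oligonucleotide sets.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "ACGT", "M": "AC", "S": "CG"}

def expand(oligo: str) -> list[str]:
    """List every concrete codon encoded by a degenerate triplet."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in oligo))]

def codon_frequencies(pools: dict[str, int]) -> dict[str, Fraction]:
    """Relative frequency of each codon when pools are mixed at the given molar ratio."""
    total = sum(pools.values())
    freqs: dict[str, Fraction] = {}
    for oligo, weight in pools.items():
        codons = expand(oligo)
        for codon in codons:
            share = Fraction(weight, total * len(codons))
            freqs[codon] = freqs.get(codon, Fraction(0)) + share
    return freqs

# i) NGG, ii) NCT, iii) NA(A or C), iv) NT(C or G) mixed at 1:1:2:2.
FOUR_OLIGO_MIX = {"NGG": 1, "NCT": 1, "NAM": 2, "NTS": 2}
frequencies = codon_frequencies(FOUR_OLIGO_MIX)
```

Sets iii) and iv) each contain eight codons against four in sets i) and ii), so doubling their share gives every one of the 24 codons the same frequency of 1/24.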

Alternatively: a set of six oligonucleotides conforming to combination A

Gene libraries can be created according to a modification of combination A, in which both Stop
and cysteine codons are eliminated, and in which each of the other amino-acids is
represented by a single codon, by creating six sets of nucleotides in which $X_{n+a}Z_{n+a+1}Z_{n+a+2}$
are:

i) (A, G or T)GG or (C, G or T)GG

ii) NCT

iii) (A, G or C)AA

iv) NAC

v) NTC

vi) ATG

After the synthesis of these oligonucleotides they can be combined to obtain a single-tube
cosmix-plexing gene library, whereby to obtain equimolar codon frequencies for each
amino-acid the gene libraries i) to vi) are present in the final mixture at a ratio of
3:4:3:4:4:1, respectively. As explained above this mixture will always give a correct orientation on
religation of type IIS restriction enzyme-cleaved fragments having the 2 bp single-stranded
cohesive ends ZZ.

Again, as with the previous sets, this single-tube library represents six subsets which are
unable to recombine with each other during cosmix-plexing.
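
The amino-acid coverage of the six-oligonucleotide set can be verified by translation. The sketch assumes the standard genetic code, uses the (A, G or T)GG variant of set i), and writes the degenerate positions in IUPAC notation (D = A, G or T; V = A, C or G); names are illustrative.

```python
from itertools import product

# Standard genetic code in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# IUPAC degenerate base codes needed for the six oligonucleotide sets.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "ACGT", "D": "AGT", "V": "ACG"}

def expand(oligo: str) -> list[str]:
    """List every concrete codon encoded by a degenerate triplet."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in oligo))]

# i) DGG, ii) NCT, iii) VAA, iv) NAC, v) NTC, vi) ATG
SIX_OLIGOS = ["DGG", "NCT", "VAA", "NAC", "NTC", "ATG"]

# Codons per set; mixing the sets in these proportions makes each codon equimolar.
pool_sizes = [len(expand(o)) for o in SIX_OLIGOS]

# One-letter amino acids encoded across the whole mixture.
encoded = [GENETIC_CODE[c] for o in SIX_OLIGOS for c in expand(o)]
```

The pool sizes reproduce the 3:4:3:4:4:1 mixing ratio, and the 19 codons translate to 19 different amino acids with neither Cys nor a stop codon among them.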

Consideration of the central amino-acid codon created during cosmix-plexing 
recombination 

The amino-acid at the recombination site is determined by the 5'-hypervariable segment. The 
set of amino-acids which may be represented at this position is defined for each subset as 
presented in Table 2.

Consideration of the number of clones needed in a "representative" library

The minimal number of clones required in a library to include all possible amino-acid
sequences in a random peptide containing n amino-acids is $20^n$, i.e. for n=9,
$20^9 = 5.12 \times 10^{11}$. In fact, at a confidence limit of say 95%, this figure must be some three-fold higher, to
allow for the statistics of sampling, i.e. ca. $1.5 \times 10^{12}$. In practice this figure may be higher
due to, e.g., non-random synthesis of the oligonucleotides used to generate the library as well
as biased codon representation (for a detailed discussion see Collins 1997).
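
The three-fold factor can be reproduced with a simple sampling estimate. The sketch assumes clones are drawn uniformly at random from all $20^n$ sequences, so that the probability of missing a given sequence after N clones is approximately exp(-N/V); this Poisson approximation is an assumption made here and is not stated in the text.

```python
import math

def clones_needed(n_residues: int, confidence: float = 0.95,
                  alphabet: int = 20) -> tuple[int, float]:
    """Return (number of variants V, clones N needed to contain a given variant).

    Uses the approximation P(variant missing) = exp(-N / V), so
    N = -ln(1 - confidence) * V.
    """
    variants = alphabet ** n_residues
    return variants, -math.log(1.0 - confidence) * variants

variants, clones = clones_needed(9)
print(f"{variants:.3g} variants, about {clones:.3g} clones at 95% confidence")
```

For n=9 the oversampling factor -ln(0.05) is close to 3, which is consistent with the figure of ca. $1.5 \times 10^{12}$ clones quoted above.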

Consideration of the number of recombined clones generated by cosmix-plexing 

The cosmix-plexing strategy is based on the concept that in initial selection experiments clone
populations will be enriched for sequences which contain structural elements based on the
primary sequence in the varied segment. Even if the optimal sequence is not present due to
the limitations imposed by the limited size of the initial library, cosmix-plexing will increase
the likelihood of finding just such a sequence by providing a large number of novel
recombinants in which the 5'- and 3'-"halves" of the varied section are reassorted, e.g. for the
hypervariable nonapeptide library described in the example, the sequences encoding the
amino-proximal five amino-acids are recombined with the sequences encoding the
carboxy-proximal four amino-acids. Since the cohesive ends essentially limit the recombination to
defined subsets, in which one subset cannot undergo recombination with any of the other
subsets, the actual number of recombinants generated is less than could be obtained with
completely random recombination.

For the initial four-tube protocol described, four separate libraries each containing four
subsets are used.

Random recombination would generate, for a set of N clones, $N^2$ recombinants, assuming $N^2$
is less than or equal to the theoretical number of variants ($20^n$, see above) which can be
encoded within the hypervariable segment, otherwise it will tend to $20^n$.

For the four-tube protocol 16 subsets are created, each representing a pool within which
recombination can take place. If the total library consists of N clones then the number of
novel recombinants which can be formed within each of the 16 subsets is $(N/16)^2$. Summing
for all sixteen subsets, the number of recombinants which can be generated is
$16 \times (N/16)^2 = N^2/16$, again assuming $N^2/16$ is less than or equal to the theoretical
number of variants ($20^n$, see above) which can be encoded within the hypervariable segment,
otherwise it will tend to $20^n$.

For the single-tube protocol only 6 subsets are created, each representing a pool within which
recombination can take place. If the total library consists of N clones then the number of novel
recombinants which can be formed within each of the 6 subsets is $(N/6)^2$. Summing for all six
subsets, the number of recombinants which can be generated is $6 \times (N/6)^2 = N^2/6$, again
assuming $N^2/6$ is less than or equal to the theoretical number of variants ($20^n$, see above)
which can be encoded within the hypervariable segment, otherwise it will tend to $20^n$.

It is thus clear that the single-tube version of the invention is superior not only in terms of
time and economy of the procedure but in the potential to generate a greater diversity from a
given number of clones during cosmix-plexing guided recombination.
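
The comparison can be made concrete with a few lines of arithmetic. The sketch assumes, as the text does implicitly, that the N clones are spread evenly over the subsets, and caps the result at the theoretical maximum of $20^n$; the clone number used in the example is arbitrary.

```python
def max_recombinants(n_clones: int, subsets: int, n_residues: int = 9) -> int:
    """Upper bound on distinct recombinants: subsets * (N/subsets)^2, capped at 20^n."""
    theoretical = 20 ** n_residues
    return min(n_clones ** 2 // subsets, theoretical)

N = 48_000  # hypothetical number of selected clones entering cosmix-plexing

random_case = max_recombinants(N, subsets=1)    # unrestricted recombination, N^2
four_tube = max_recombinants(N, subsets=16)     # four libraries of four subsets
single_tube = max_recombinants(N, subsets=6)    # single-tube combination A

print(random_case, four_tube, single_tube)
```

For any N below the cap the single-tube protocol yields 16/6, i.e. about 2.7 times, as many potential recombinants as the four-tube protocol.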



A specific embodiment of the invention concerns a bank of
genes, wherein the six sets of oligonucleotide sequences
are present at a ratio of (i):(ii):(iii):(iv):(v):(vi) of
(0 to 1):(0 to 1):(0 to 1):(0 to 1):(0 to 1):(0 to 1) with
the proviso that at least one of said sets is present.

Further, a specific embodiment of the invention concerns a bank
of genes wherein each gene is provided as display vector,
especially as M13 phage or M13 like phage or as phagemid.

Further, a specific embodiment of the invention concerns a bank
of genes wherein the double stranded DNA sequence is
comprised by a DNA region (fusB) encoding a peptide or a
protein to be displayed.

Further, a specific embodiment of the invention concerns a bank
of genes, characterized in that n = j = 6, a = 14 and b = 16.

Further, a specific embodiment of the invention concerns a bank
of genes wherein the restriction enzyme is a type IIS
restriction enzyme.

Further, a specific embodiment of the invention concerns a bank
of genes which is characterized in that

(a) subsequence $B_1 \ldots B_n$ is the recognition site for the
restriction enzyme BpmI (CTGGAG) and subsequence
$Q_{n+a+b+1} \ldots Q_{n+a+b+j}$ is an inverted BsgI recognition site
(CTGCAC); or

(b) subsequence $B_1 \ldots B_n$ is the recognition site for the
restriction enzyme BsgI (GTGCAG) and subsequence
$Q_{n+a+b+1} \ldots Q_{n+a+b+j}$ is an inverted BpmI recognition site
(CTCCAG).
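
The "inverted" sites quoted in (a) and (b) are the reverse complements of the forward recognition sequences, which can be confirmed with a minimal sketch (identifiers are illustrative):

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

# Forward recognition sequences as given in the text.
RECOGNITION_SITES = {"BpmI": "CTGGAG", "BsgI": "GTGCAG"}

# Inverted sites as they read on the strand shown in the formula.
inverted_sites = {enzyme: reverse_complement(site)
                  for enzyme, site in RECOGNITION_SITES.items()}

print(inverted_sites)
```

Placing one site in forward and the other in inverted orientation lets both enzymes cut inwards, towards the hypervariable region between them.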

Further, a specific embodiment of the invention concerns a bank
of genes which is characterized in that the hypervariable
sequence $X_{n+1} \ldots X_{n+a+b}$ contains NNB or NNK wherein
N = adenine (A), cytosine (C), guanine (G) or thymine
(T);

B = cytosine (C), guanine (G) or thymine (T); and
K = guanine (G) or thymine (T).
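
The effect of restricting the third codon position can be illustrated by enumerating both schemes against the standard genetic code. This is a sketch, assuming the standard code in TCAG ordering; one-letter amino-acid symbols are used, with `*` for stop, and the names are illustrative.

```python
from itertools import product

# Standard genetic code in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Bases allowed at each codon position for the two randomisation schemes.
SCHEMES = {
    "NNK": ("ACGT", "ACGT", "GT"),
    "NNB": ("ACGT", "ACGT", "CGT"),
}

def summarize(scheme: str) -> tuple[int, int, list[str]]:
    """Return (number of codons, number of amino acids covered, stop codons)."""
    codons = ["".join(c) for c in product(*SCHEMES[scheme])]
    amino_acids = {GENETIC_CODE[c] for c in codons} - {"*"}
    stops = [c for c in codons if GENETIC_CODE[c] == "*"]
    return len(codons), len(amino_acids), stops

for name in SCHEMES:
    print(name, summarize(name))
```

Both schemes still encode all 20 amino acids while reducing the three stop codons of a fully random NNN codon to TAG alone.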

Another embodiment of the invention concerns a phagemid 
pROCOS4/7 of the sequence shown in Fig. 6. 

Still another embodiment of the invention concerns a phagemid 
pROCOS5/3 of the sequence shown in Fig. 7. 

Another embodiment of the invention concerns a method for the 
production of large 

- phage-display libraries or 

- phagemid-display libraries, 

containing or consisting of optionally packaged recombined 
display vectors, wherein recombination takes place at the 
cleavage site(s) for a restriction enzyme (cut (B) enzyme; 
arrow in Pig. 3) and wherein 

(a) to (b) a double- stranded DNA prepared from Escherichia 
coli cells containing a display vector population, 
consisting of M13 phages or M13 like phages or consisting 
of phagemids according to the invention; a cosmid 



wo 98/33901 PCT/EP98/00533 

-21- 

vector? a restriction enzyme for cut (B) ; and a 
restriction enzyme for cut (A) are selected, wherein 

(i) the cut (B) enzyme cleaves the display vectors in the 
region encoding the displayed peptide or displayed protein 
(arrow in Fig. 3) and generates unique non- symmetrical 
cohesive ends, wherein each cohesive end is a 2 bp single 
strand end formed by the two bases designated Z, and 

(ii) the cut (A) enzyme cleaves the display vectors and 
the cosmid vector and generates upon cleavage unique non- 
symmetrical cohesive ends (fusA) which differ from those 
resulting from cut (B) , 

(c) the display vectors are cleaved with the first 
restriction enzyme, 

(d) the display vector and the cosmid vector are cleaved 
with the second restriction enzyme, 

(e) the cleaved display vectors are ligated with the 
cleaved cosmid vectors forming concatamers, 

(f) the ligation product is subjected to a lambda 
packaging and transduced into an Escherichia coli host, 

(g) if wanted, selection is made for a gene present in the 
ligated display vectors, 

(h) the transduced display vectors in the Escherichia coli 
host are 



- either in the case of a phage-display vector 
spontaneously packaged in M13 or M13 like phage coats 



- or in the case of a phagemid-display vector packaged by 
infecting the Escherichia coli host with an M13 type 
helper phage (superinfection) , 

(i) the packaged display vectors are passaged in a fresh 
Escherichia coli host and phage-display or phagemid- 
display libraries are formed and, if wanted, 

(j) the passaged display vectors are 

- either in the case of a phage-display vector 
spontaneously packaged in M13 or M13 like phage coats 

- or in the case of a phagemid-display vector packaged by 
infecting the fresh Escherichia coli host with an M13 type 
helper phage (superinfection) and 

phage-display or phagemid-display libraries are formed. 
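
The method above relies on 2 bp cohesive ends that are non-symmetrical, so that a cut end cannot anneal to a copy of itself in inverted orientation. As a minimal sketch (names are illustrative, not from the specification), the following enumerates all sixteen 2-base overhangs and separates the self-complementary ones, which would permit head-to-head ligation, from the rest.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def is_self_complementary(overhang: str) -> bool:
    """True if the overhang can anneal to an identical overhang."""
    return overhang == reverse_complement(overhang)


two_base_ends = ["".join(p) for p in product("ACGT", repeat=2)]
symmetrical = [e for e in two_base_ends if is_self_complementary(e)]
non_symmetrical = [e for e in two_base_ends if not is_self_complementary(e)]

print("symmetrical:", symmetrical)
print("non-symmetrical:", non_symmetrical)
```

Only the dinucleotides in the second list are candidates for the ZZ positions under this reading of "non-symmetrical".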

A specific embodiment of the invention concerns a method which 
is characterized in that in steps (a) to (b) a type IIS 
restriction enzyme is selected, preferably BglI, DraIII, 
BsgI or BpmI. 

Further, a specific embodiment of the invention concerns a 

method which is characterized in that for cuts (B) and (A) 
the same restriction site and/or restriction enzyme is 
selected. 

Further, a specific embodiment of the invention concerns a 
method which is characterized in that as cut (B) enzyme 
and as cut (A) enzyme different enzymes are used (Fig. 3), 
preferably BsgI or BpmI as cut (B) enzyme and DraIII as 
cut (A) enzyme (fd or M13 replication origin cut). 

Further, a specific embodiment of the invention concerns a 
method which is characterized in that in 
step (h) and facultatively in step (j) M13K07 is used as 
M13 type helper phage. 

Further, a specific embodiment of the invention concerns a 
method which is characterized in that 
the phagemid and the cosmid are identical and, 
further, presence of and cleavage with cut (A) enzyme is 
optional and/or cut (B) enzyme and cut (A) enzyme are 
identical. 

Further, a specific embodiment of the invention concerns a 
method which is characterized in that in step (i) the 
multiplicity of infection (MOI) is less than or equal to 
1. 
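
One way to see why a low multiplicity of infection matters is to model the number of particles entering each cell as a Poisson variable. This is a common simplifying assumption and is not stated in the specification. Under that assumption, the sketch below gives the fraction of infected cells that received exactly one particle.

```python
import math


def fraction_singly_infected(moi: float) -> float:
    """Among infected cells, the fraction hit by exactly one particle.

    Assumes Poisson-distributed infection events with mean `moi`.
    """
    p_exactly_one = moi * math.exp(-moi)
    p_at_least_one = 1.0 - math.exp(-moi)
    return p_exactly_one / p_at_least_one


for moi in (0.1, 0.5, 1.0, 3.0):
    print(f"MOI {moi}: {fraction_singly_infected(moi):.3f}")
```

The fraction falls as the MOI rises, so the lower the MOI in step (i), the larger the share of cells carrying a single variant.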

Further, a specific embodiment of the invention concerns a 
method wherein the cosmid comprises an fd or M13 
bacteriophage origin (replication/packaging). 

Further, a specific embodiment of the invention concerns a 

method wherein in step (e) a molar ratio of display vectors 
to the cosmid vector within the range of from 3:1 to 15:1 
and preferably 3:1 to 10:1 is used. 
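
A molar ratio has to be converted into masses before a ligation is set up, because longer molecules weigh more per mole. The sketch below does that conversion. The 4.5 kb display vector size is the example size from the Figure 2 legend, while the 5 kb cosmid size and the 10 ug total are hypothetical values chosen only for illustration.

```python
def ligation_masses(total_ug: float, molar_ratio: float,
                    display_bp: int, cosmid_bp: int) -> tuple[float, float]:
    """Split a total DNA mass so that display:cosmid molecules = molar_ratio:1.

    Mass is taken as proportional to (number of molecules x length in bp).
    """
    display_weight = molar_ratio * display_bp
    cosmid_weight = cosmid_bp
    display_ug = total_ug * display_weight / (display_weight + cosmid_weight)
    return display_ug, total_ug - display_ug


DISPLAY_BP = 4500  # example phagemid size from the Figure 2 legend
COSMID_BP = 5000   # hypothetical cosmid size, not from the source

for ratio in (3, 10, 15):
    display_ug, cosmid_ug = ligation_masses(10.0, ratio, DISPLAY_BP, COSMID_BP)
    print(f"{ratio}:1 -> display {display_ug:.2f} ug, cosmid {cosmid_ug:.2f} ug")
```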



Further, a specific embodiment of the invention concerns a 
method wherein in step (e) a vector concentration 
(comprising display vectors and cosmid vectors) of more 
than 100 µg DNA/ml is used. 

Another embodiment of the invention concerns a method for the 
production of large 

- phage-display extension libraries or 

- phagemid-display extension libraries, wherein 

an oligonucleotide cassette of d bases in length is 
inserted into a restriction site (cut (B)) via the 
cohesive ends ZZ as defined above to yield a sequence 
(supra sequence) or a gene comprising a double stranded 
DNA sequence which is represented by the following formula 
of one of its strands: 

B_1...B_n X_{n+1}...X_{n+a} Z_{n+a+1} Z_{n+a+2} 
X_{n+a+3}...X_{n+a+d} Z_{n+a+d+1} Z_{n+a+d+2} 
X_{n+a+d+3}...X_{n+a+d+b} Q_{n+a+d+b+1}...Q_{n+a+d+b+j}, 

wherein d is an integer and a multiple of 3, preferably 
within the range of from 6 to 36; n, a, b and j and B, X, 
Z and Q have the same meaning as in any of the preceding 
claims; and wherein 

(a) to (b) a double-stranded DNA prepared from Escherichia 
coli cells containing a display vector population, 
consisting of M13 phages or M13 like phages or consisting 
of phagemids according to the invention; a cosmid vector; 
a restriction enzyme for cut (B) ; and a restriction enzyme 
for cut (A) are selected, wherein 

(i) the cut (B) enzyme cleaves the display vectors in the 
region encoding the displayed peptide or displayed protein 
and generates unique non-symmetrical cohesive ends, 
wherein each cohesive end is a 2 bp single strand end 
formed by the two bases designated Z, 

(ii) the cut (A) enzyme cleaves the display vectors and the 
cosmid vector such that unique non-symmetrical cohesive 
ends are formed which differ from those resulting from cut 
(B), 

(c1) the display vectors are cut with the cut (B) 
restriction enzyme, 

(c2) a DNA cassette is inserted into the cleavage site 
via its ZZ cohesive ends, 

(d) the resulting display vector and the cosmid vector are 
cleaved with the cut (A) restriction enzyme, 

(e) the cleaved display vectors are ligated with the 
cleaved cosmid vectors forming concatamers; 

(f) the ligation product is subjected to a lambda packaging 
and transduced into an Escherichia coli host such that the 
DNA cassette lies between two hypervariable sequences 
(extension sequences), 

(g) if wanted, selection is made for a gene present in the 
ligated display vectors, 

(h) the transduced display vectors in the Escherichia coli 
host are 

- either in the case of a phage-display vector 
spontaneously packaged in M13 or M13 like phage coats 

- or in the case of a phagemid-display vector packaged by 
infecting the Escherichia coli host with an M13 type 
helper phage (superinfection) , 

(i) the packaged display vectors are passaged in a fresh 
Escherichia coli host and phage-display or phagemid- 
display libraries are formed, and, if wanted, 

(j) the passaged display vectors are 

- either in the case of a phage-display vector 
spontaneously packaged in M13 or M13 like phage coats 

- or in the case of a phagemid-display vector packaged by 
infecting the fresh Escherichia coli host with M13 type 
helper phages (superinfection) and 

phage-display or phagemid-display extension libraries are 
formed. 

Another embodiment of the invention concerns a method for the 
reassortment of the 5'- and/or 3'-extensions in the 
production of large recombinant 

- phage-display extension libraries or 

- phagemid-display extension libraries, 

comprising the sequence as defined before wherein 
recombination takes place at one or the other, or 
consecutively at both the cleavage site(s) ZZ bracketing 
the inserted cassette(s), wherein 

(a) to (b) a double-stranded DNA prepared from Escherichia 
coli cells containing a display vector population, 
consisting of M13 phages or M13-like phages or consisting 
of phagemids as display vectors as defined before; a 
cosmid vector; a restriction enzyme for cut (B); and a 
restriction enzyme for cut (A) are selected, wherein 

(i) the cut (B) enzyme cleaves the display vectors in the 
region encoding the displayed peptide or displayed protein 
and generates unique non-symmetrical cohesive ends at 
selectively either 

- the 5'-junction of extension and cassette (cleavage by the 
restriction enzyme recognizing the binding site B_1...B_n as 
defined before) or 

- the 3'-junction of extension and cassette (cleavage by the 
restriction enzyme recognizing the binding site 
Q_{n+a+b+1}...Q_{n+a+b+j} defined before, or 
Q_{n+a+d+b+1}...Q_{n+a+d+b+j} defined before), wherein each 
cohesive end is a 2 bp single strand end formed by the two 
bases designated Z, 

(ii) the cut (A) enzyme cleaves the display vectors and 
the cosmid vector and generates upon cleavage unique non- 
symmetrical cohesive ends which differ from those 
resulting from cut (B) , 

(c) the display vectors are cleaved with the first 
restriction enzyme, 

(d) the display vector and the cosmid vector are cleaved 
with the second restriction enzyme, 

(e) the cleaved display vectors are ligated with the 
cleaved cosmid vectors forming concatemers, 

(f) the ligation product is subjected to a lambda 
packaging and transduced into an Escherichia coli host, 

(g) if wanted, selection is made for a gene present in the 
ligated display vectors, 

(h) the transduced display vectors in the Escherichia coli 
host are 

- either in the case of a phage-display vector 
spontaneously packaged in M13 or M13-like phage coats 

- or in the case of phagemid-display vectors packaged by 
infecting the Escherichia coli host with an M13-type 
helper bacteriophage (superinfection) , 

(i) the packaged display vectors are passaged in a fresh 
Escherichia coli host and phage-display or phagemid- 
display libraries are formed and, if wanted, 

(j) the passaged display vectors are 

- either in the case of a phage-display vector 
spontaneously packaged in M13 or M13-like phage coats 

- or in the case of a phagemid vector packaged by 
infecting the fresh Escherichia coli host with M13-type 
helper phages (superinfection) and 

phage-display or phagemid-display libraries are formed. 

A specific embodiment of the invention concerns a method which 
is characterized in that in steps (a) to (b) a type IIS 
restriction enzyme is selected, preferably BglI, DraIII, 
BsgI or BpmI. 

Further, a specific embodiment of the invention concerns a 
method which is characterized in that for cuts (i) and 
(ii) the same restriction site is selected. 

Further, a specific embodiment of the invention concerns a 
method which is characterized in that as cut (B) enzyme 
and as cut (A) enzyme different enzymes are used, 
preferably BsgI or BpmI as cut (B) enzyme and DraIII as 
cut (A) enzyme (fd or M13 replication origin is cut). 

Further, a specific embodiment of the invention concerns a 
method which is characterized in that in step (h) and 
facultatively in step (j) M13K07 is used as the M13-type 
helper phage. 

Further, a specific embodiment of the invention concerns a 
method which is characterized in that in step (g) 
selection is made for the presence of an antibiotic 
resistance gene. 

Further, a specific embodiment of the invention concerns a 
method which is characterized in that in step (i) the 
multiplicity of infection (MOI) is less than or equal to 
1. 

Further, a specific embodiment of the invention concerns a 
method wherein the cosmid comprises an fd or M13 
bacteriophage origin. 

Further, a specific embodiment of the invention concerns a 

method wherein in step (e) a mol ratio of display vectors 
to the cosmid vector within the range 3:1 to 15:1 and 
preferably 3:1 to 10:1 is used. 

Further, a specific embodiment of the invention concerns a 
method wherein in step (e) a vector concentration 
(comprising display vectors and cosmid vectors) of more 
than 100 µg DNA/ml is used. 

Another embodiment of the invention concerns a method for the 
de novo production of large 

- phage-display libraries or 

- phagemid-display libraries, 

comprising DNA sequences as defined before, and 
subjectable to recombination according to a procedure as 
defined before, wherein recombination takes place within a 
DNA sequence as defined before, wherein 

a) a display vector, consisting of an M13 phage or M13- 
like phage or consisting of a phagemid-display vector 
comprising a bacteriophage replication origin, 
facultatively a gene for a selectable marker, preferably 
an antibiotic resistance, a lambda bacteriophage cos-site 
and a "stuffer" sequence (Figure 5 upper right), 
containing two binding sites for a type IIS restriction 
enzyme different from any of the enzymes as defined before 
(cut (B) and cut (A)), wherein said two sites are oriented 
in divergent orientation and where the cohesive ends 
generated on cleavage are non-symmetrical and differ from 
one another at the two sites, and 

b) a PCR-generated fragment comprising part of one of the 
sequences as defined before, including a (the) 
hypervariable sequence(s), preferably 
X_{n+1}...X_{n+a} Z_{n+a+1} Z_{n+a+2} X_{n+a+3}...X_{n+a+b} 
according to the 
invention, bracketed by the same type IIS restriction 
enzyme binding sites defined in (a), but in this case both 
oriented inwards towards the hypervariable sequence 
(Figure 5 left side) and where on cleavage by this 
restriction enzyme two non-symmetrical, single strand ends 
different from one another are generated, where the first 
end (a* in Fig. 5) is complementary to one of the ends (a 
in Fig. 5) generated on the large vector fragment in (a) 
and the second end (b* in Fig. 5) is complementary to the 
other end (b in Fig. 5) generated on the large vector 
fragment in (a), 

c) the two cleavage reaction systems (a) and (b) still 
containing the active type IIS restriction enzyme are 
mixed together in approximately equimolar proportions and 
subjected to ligation in the presence of DNA ligase; 

fragments containing the restriction enzyme binding sites are 
constantly removed ("stuffer" fragment and outer end of 
the PCR product) whereas 

the other two components, namely the large vector fragment and 
the insert sequence (central fragment from the PCR 
reaction) are driven to form 

A) a concatameric hybrid if the ligation is carried out at > 
100 µg DNA/ml (Figure 5), or 

B) a circular hybrid if the ligation is carried out at ≤ 
40 µg DNA/ml, 

d1) in the case of protocol A) the DNA is packaged into 
lambda particles and transduced into an Escherichia coli 
host, 

d2) in the case of protocol B) the DNA is transformed in 
an Escherichia coli host, 

e) if wanted, selection is made for a gene present in the 
ligated display vectors, 

f) the transduced display vectors in the Escherichia coli 
host are 

- either in the case of a phage-display vector 
spontaneously packaged in M13 or M13-like phage coats 

- or in the case of phagemid-display vectors packaged by 
infecting the Escherichia coli host with an M13-type 
helper bacteriophage (superinfection) , 

(g) the packaged display vectors are passaged in a fresh 
Escherichia coli host and phage-display or phagemid- 
display libraries are formed and, if wanted, 

(h) the passaged display vectors are 

- either in the case of a phage-display vector 
spontaneously packaged in M13 or M13-like phage coats 

- or in the case of a phagemid vector packaged by 
infecting the fresh Escherichia coli host with M13-type 
helper phages (superinfection) and 

phage-display or phagemid-display libraries are formed. 

A specific embodiment of the invention concerns a method which 
is characterized in that in steps (a) to (b), as type IIS 
restriction enzyme, preferably BpiI, BsgI or BpmI is 
selected. 



Further, a specific embodiment of the invention concerns a 
method which is characterized in that in step (f) and 
facultatively in step (h) M13K07 is used as the M13-type 
helper phage. 

Further, a specific embodiment of the invention concerns a 
method which is characterized in that in step (e) 
selection is made for the presence of an antibiotic 
resistance gene. 

Further, a specific embodiment of the invention concerns a 
method which is characterized in that in step (g) the 
multiplicity of infection (MOI) is less than or equal to 
1. 



Another embodiment of the invention concerns a phage-display 
library or a phagemid-display library in the form of 
packaged particles obtainable according to any of the 
methods as described before. 



Another embodiment of the invention concerns a phage-display 
library or a phagemid-display library in the form of 
display vectors comprised by Escherichia coli 
population(s) obtainable according to any of the methods 
as described before. 

Another embodiment of the invention concerns phage-display 
libraries or phagemid libraries which are characterized by 
a gene (genes) as defined before and obtainable according 
to the invention, wherein the term "large" as used before 
is defined as in excess of 10^ variant clones, 
preferentially 10^ to 10^11 variant clones. 

Finally, another embodiment of the invention concerns a protein 
or peptide comprising a peptide sequence encoded by a DNA 
sequence as defined before and obtainable by affinity 
selection procedures on a defined target by means of a 
from libraries as defined before. 

Detailed Description 

The invention pertains to a novel combination of recombinant 
DNA technologies to produce large hypervariable gene banks for 
the selection of novel ligands of pharmaceutical, diagnostic, 
biotechnological, veterinary, agricultural and biomedical 
importance with an efficiency higher than was hitherto 
attainable. 

The size of the hypervariable gene bank is presently considered 
the most essential factor limiting the usefulness of the 
methodology for such purposes, since, as an empirical method, 
it depends on the diversity (number of different variants) 
initially generated in the bank (hypervariable gene library). 
In contrast to this traditional opinion we consider that, when 
a highly efficient method is developed, as presented here, to 
generate a large proportion of the possible combinations of 
mutated segments of the variants from a preselected 
subpopulation, a population enriched for the desired structural 
elements will be generated which would otherwise only have been 
represented in a population approaching N^x, where N is the size 
of the original population and x is the number of segments to 
be recombined. 
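
The combinatorial argument can be made concrete with a short calculation. The sketch below computes N^x and the share of that space a library of a given size could cover at most. The numbers used (4x10^4 preselected variants, two segments, a library of 10^8 clones) are illustrative assumptions chosen to match the one-sixteenth example in the Figure 4 legend; the function names are not from the specification.

```python
from fractions import Fraction


def possible_recombinants(n_variants: int, segments: int) -> int:
    """Number of combinations when each of `segments` positions can be
    taken from any of `n_variants` parent clones (N^x)."""
    return n_variants ** segments


def fraction_obtainable(n_variants: int, segments: int,
                        library_size: int) -> Fraction:
    """Upper bound on the share of all recombinants a library can hold."""
    total = possible_recombinants(n_variants, segments)
    return min(Fraction(1), Fraction(library_size, total))


N = 4 * 10**4      # illustrative size of the preselected subpopulation
X = 2              # one recombination site gives two segments
LIBRARY = 10**8    # illustrative number of clones obtained

print(possible_recombinants(N, X))
print(fraction_obtainable(N, X, LIBRARY))
```

This is an upper bound only: it assumes every clone in the library is a distinct recombinant.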

The first part of the invention pertains to novel sequences 
which allow recombination within hypervariable DNA sequences 
encoding regions (domains) of variable peptides or proteins 
displayed in combinatorial phage/phagemid display libraries 
using type IIS restriction endonucleases both (a) to introduce 
a cut at the site of recombination and (b) to generate 
oriented substrates for a ligation reaction, where the ligation 
products are then recloned at high efficiency after in vitro 
packaging in a lambda packaging mix. The entire protocol yields 
efficiencies (clones per input DNA) in excess of any described 
technology (>10^ clones per microgram ligated DNA). 

Combinations of (vector) sequences and protocols are claimed for 
both the production of the initial libraries and for 
recombinational procedures to generate increased diversity 
within the library or a selected subpopulation at any time. In 
particular such sequences and procedures are claimed for the 
generation and use of phage/phagemid-display combinatorial 
libraries. 

The inventors recognise that the main factor thereby determining 
the efficient generation of further variation is the efficient 
production of combinatorial libraries from the initial 
libraries, via reassortment of smaller elements (specific 
peptide sequences within the hypervariable region, and/or 
reassortment of structural domains) which contribute to the 
properties selected for. The invention presents such a method, 
which has the unique property that the recombination site may 
be within the hypervariable region whereby no restriction is 
imposed on the sequence within the hypervariable region 
involved. Alternatively the method can be used to reassort 
domains of proteins or subunits of heteromeric proteins 
(proteins composed of two or more different variant polypeptide 
chains), each of which can contain hypervariable regions, 
without resorting to recloning isolated DNA fragments or 
generating new libraries containing new synthetic 
oligonucleotides. It is noted that this method thus offers a 
saving in both time and materials when optimizing a structure 
for a predetermined property on the basis of a preselected 
clone population (subpopulation) and in view of the geometrical 
increase in possible variability offered may represent a 
qualitatively novel feature in that some rare structures may be 
obtainable only by the novel strategy described. 

The method, which we designate cosmix-plexing, is based on the 
design of the cloning vectors, the inserts used and a 
combination of special recombinant DNA protocols, which in 
particular use i) cleavage of the phage/phagemid DNA with type 
IIS restriction enzymes, ii) subsequent ligation to 
concatamers which are iii) packaged in vitro with a lambda 
packaging system for iv) efficient transduction into E. coli 
strains, where they are then v) repackaged in vivo in 
filamentous phage coats. The use of cosmix-plexing, so 
defined, on a heterogeneous phage/phagemid population 
generates an enormous increase in novel variants at any time 
during further experimentation, e.g. after any enrichment step 
for structures having the predetermined property or 
properties. 

In particular subpopulations which are enriched from the 
original library for a specific property will be enriched for a 
consensus motif (a degenerate set of related sequences within 
the varied region(s) which all exhibit the required property to 
some extent) which may (probably will) include the optimal 
sequence in terms of the required property. Reassortment of 
these regions or portions of a single hypervariable sequence by 
cosmix-plexing will increase the probability of obtaining the 
optimal sequence. The subpopulations may be isolated by 
differential affinity-based selection on a defined target, or 
enrichment procedures based on other desired selectable 
properties (example 1: substrate properties such as 
phosphorylation by a particular protein kinase enriched by 
binding on antibodies which recognise the modified (in this 
case phosphorylated) substrate; or example 2: cleavage of the 
variant sequence by an endoprotease, using selective release of 
the phage or phagemid previously bound via an interaction 
between a terminal protein structure (anchor) and its ligand 
immobilised to, or later trapped on, a surface). 

The invention further covers the generation of extension 
libraries in which e.g. a "project-specific cassette" is 
inserted at the recombination site within the gene bank. 
Optimisation of ligands can then occur by the generation of 
further combinatorial libraries from selected clones in which 
the adjacent regions may be efficiently "shuffled", either 
singly or both at a time. As far as we are aware no other 
system provides this "cassette insertion/exchange" feature. 

Figure legends: 

Figure 1. Diagrammatic representation of the steps involved in 
creating recombination within the hypervariable regions of 
cosmix-plexing libraries. 

Double-stranded phagemid from a number of clones (which may be 
a cosmid itself) and cosmid DNA (if the phagemid is not a 
cosmid) are cleaved with a type IIS restriction enzyme 
(cleavage sites indicated by a small bar) within the 
hypervariable region and ligated together at high DNA 
concentration so that long concatemers of the DNA molecules are 
formed, which are all oriented in the same direction, e.g. with 
respect to the M13 packaging origins, i.e. no palindromic 
regions are formed. The vectors contain one or more restriction 
site(s) for the type IIS restriction enzyme such that no 
cohesive ends are formed which on ligation could form 
palindromic (i.e. head-to-head or tail-to-tail) structures. 
When the cohesive ends produced on cleavage by the restriction 
enzyme are themselves non-palindromic and unique to each 
restriction site within each plasmid/phagemid, only ring 
closure and the formation of concatemers can occur. At 
higher DNA concentrations (i.e. over 200 µg/ml) concatemer 
formation will be preferred. A more detailed presentation of 
the molecular structures formed is given in Figures 2 and 3. 
The ligation product is added to an in vitro lambda packaging 
extract where the DNA is packaged into a lambda bacteriophage 
coat as a linear DNA of 37 to 50 kb cleaved at a lambda cos- 
site. In the following step, referred to as transduction, these 
particles carrying the cosmid-phagemid hybrid DNA are added to 
Escherichia coli cells (shown as large ellipses in the diagram) 
into which the DNA injects itself. In the cell it is 
circularized by closure of the cleaved cos-site using the 
endogenous DNA ligase. It is then propagated as a large cosmid- 
phagemid hybrid, replicating from the plasmid DNA replication 
origin(s). M13-type helper phage (e.g. M13K07) is added to 
these cells in the step referred to as superinfection. On entry 
of the helper phage single strand replication is initiated from 
the M13 replication origins present in the individual copies of 
the phagemid contained in the concatemer. During this process 
the phage are also packaged into M13 coats, and secreted into 
the medium. The phagemid can be harvested from the supernatant 
of the culture. A second passage, i.e. transduction into an 
E. coli host and repackaging by superinfection with helper phage, 
is necessary before these phagemid are used in a selection 
procedure in order to ensure that a particular variant protein 
is presented only on the particle carrying the gene for that 
particular variant protein. It is noted that this is a highly 
efficient process in which a yield of more than 10 different 
phagemid can be produced per microgram of ligated input DNA. 
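
The 37 to 50 kb packaging window fixes how many phagemid copies one packaged concatemer can carry. The sketch below works this out with integer arithmetic. The 4.5 kb phagemid size is the example used in the Figure 2 legend; the 5 kb cosmid size is a hypothetical value, and the function name is illustrative.

```python
def copies_per_packaged_concatemer(unit_bp: int, cosmid_bp: int = 0,
                                   window_bp: tuple[int, int] = (37000, 50000)
                                   ) -> tuple[int, int]:
    """Return (min, max) phagemid copies that fit the packaging window.

    Assumes the packaged DNA is one optional cosmid plus whole phagemid
    units only; junction and cos-site details are ignored.
    """
    low, high = window_bp
    min_copies = max(0, -(-(low - cosmid_bp) // unit_bp))  # ceiling division
    max_copies = (high - cosmid_bp) // unit_bp             # floor division
    return min_copies, max_copies


PHAGEMID_BP = 4500  # example size from the Figure 2 legend
COSMID_BP = 5000    # hypothetical cosmid size, not from the source

print(copies_per_packaged_concatemer(PHAGEMID_BP, COSMID_BP))
print(copies_per_packaged_concatemer(PHAGEMID_BP))
```

With the assumed 5 kb cosmid the range agrees with the "8 to 10 copies" quoted in the Figure 2 legend; without a separate cosmid the same window holds slightly more copies.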

Figure 2. The diagram illustrates the DNA structures formed 
when the cosmix-plexing protocol is carried out as shown in 
Figure 1. Different variants are designated by different 
patterns for the whole plasmid. Initially double-stranded DNA 
is cleaved with a type IIS restriction enzyme A. The ligation 
product is illustrated as a concatemer in which each phagemid 
is oriented in the same orientation. The products of 37 to 50 kb 
introduced after in vitro lambda packaging and introduction into 
the E. coli cells (shaded ellipses) are shown, whereby, for 
example, 8 to 10 copies of a 4.5 kb phagemid may be present per 
cell. On repackaging the same phagemid are obtained as were 
present before cleavage and ligation. The protocol as shown 
here, in which the M13 packaging/replication site and the 
restriction site for enzyme A are identical, is simply an 
efficient method of amplification when starting with double 
stranded DNA. 

Figure 3. The diagram illustrates a variant of the protocol 
illustrated in Figures 1 and 2 in which recombination is 
achieved between different phagemid variants. The cross-over 
point for the recombination is the cleavage site for the type 
IIS restriction enzyme B (shown as a hollow arrow) cleaving 
preferentially within a hypervariable region or between two 
different variable regions (see also Figure 4, where additional 
cleavage sites within other variable regions may be recombined 
simultaneously). Again, as mentioned in the Figure 1 legend, 
each phagemid may be a cosmid itself, in which case the 
addition of another cosmid is unnecessary. In this example 
cleavage with the restriction enzyme A is optional. Although 
Figures 2 and 3 are almost identical it should be noted that 
the products of the scheme in Figure 3 are all recombined, i.e. 
hybrids of the two sides of different variants. Repassaging is 
needed before the recombined library is used in selection 
experiments, for the same reasons discussed in the previous two 
Figures. 

Figure 4. Cosmix-plexing strategies. 

The left part of the figure shows the hypervariable DNA 
sequences encoding the variable portion of the peptide or 
protein presented on the phage/phagemid. The four bars 
designated 'variants' show that there are different sequences 
on either side of the type IIS restriction cleavage site. 
Phagemid DNA from the variant clones can be cleaved with the 
type IIS restriction enzyme and religated to yield the 
indicated number of recombinant clones, within the limits of 
the cloning efficiency. If one starts with a subpopulation of 
preenriched variants from the primary library (say 4x10^4 
clones) then one-sixteenth of all possible recombinants (10^8) 
can be obtained. 

The construction of "extension libraries" is shown below the 
dotted line. In this case a project-specific cassette 
containing a biased codon distribution encoding some sequence 
elements previously defined as advantageous for binding to the 
target is inserted into the hypervariable sequence at the type 
IIS restriction cleavage site. The large library thus generated 
encodes a protein containing three segments (domains B, A and 
C), whereby the central domain A is encoded by the project- 
specific cassette, and is bordered by the hypervariable domains 
B and C. 

The formulae for the numbers of variants obtained are made for 
the protocol in which four separate libraries are constructed. 

The right side of the figure illustrates how the variant 
protein might bind to a target protein. The variants selected 
from the extension library are expected to have a larger 
surface of interaction and thus to exhibit stronger and/or 
more specific binding to the defined target. The target may be 
a cell, a (partially) purified protein or peptide e.g. enzyme, 
antibody, hormone or lymphokine, cell receptor or in fact any 
defined surface or particle suspension, possibly coated with 
one of the aforementioned targets, which is amenable to 
physical separation, i.e. the wall of a receptacle (tube, 
tubing, flask, microtiter plate, a planar surface), or a 
particle (e.g. beads, magnetic beads, or droplets in a two- 
phase liquid system). 

Figure 5: Driven directed cloning (DDC) 

This figure illustrates an example of a cloning protocol which 
has excellent properties for the highly efficient construction 
of hypervariable libraries and extension libraries, which can 
be used with the cosmix-plexing method. The left side of the 
figure shows the preparation of the hypervariable cassette to 
be inserted into the cosmid-phagemid vector. The 
cosmid-phagemid vector containing a "stuffer fragment" is shown 
on the right. Both the PCR product containing the hypervariable 
sequence, shown as a line of asterisks, and the vector 
containing the "stuffer" are cleaved with the same type IIS 
restriction enzyme(s). It is noted that the recognition sites 
for this (these) enzyme(s) are oriented in opposite directions, 
i.e. outwards from the stuffer in the case of the vector, and 
inwards in the case of the PCR product. After cleavage neither 
the hypervariable cassette to be inserted nor the vector 
contains any of the original type IIS restriction enzyme 
recognition sites. The vector and insert are, however, designed 
to have non-palindromic cohesive ends at their termini, 
generated by the restriction enzyme cleavage, so that 
ligation of insert and vector leads to an oriented insertion of 
the hypervariable region. In addition, the vector cannot 
undergo ring closure in the absence of the insert cassette, nor 
can the insert fragments ligate to one another. Since the 
ligation is carried out at high DNA concentration and in the 
continued presence of the restriction enzyme, any ligation 
product resembling the initial uncleaved or partially 
cleaved vector or PCR product will be immediately recleaved. 
This combination of oriented non-palindromic cohesive ends and 
recleavage of unwanted ligation products drives, especially at 
high DNA concentration, where ring closure of a 
vector-insert hybrid is at a disadvantage, the formation of 
the concatemeric structure required for highly 
efficient cosmid packaging. The primary cosmix-plexing library 
is finally formed by transducing the packaged cosmid-phagemid 
hybrids into an E. coli host which contains, or is superinfected 
with, an M13-like helper phage. The phagemids are repassaged in 
a second M13 phage-packaging step before use in selection, so 
that individual phage clones are derived from singly infected 
cells. This is necessary in order that each phagemid particle 
carries the variant encoded in its genome. This is not the 
situation in the first packaging step, in which the E. coli host 
contains a concatemer of some eight different variant phagemids. 
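
The role of non-palindromic cohesive ends in the Figure 5 legend can be illustrated with a minimal sketch. The two example overhangs are hypothetical and are not taken from the figure; the helper names are likewise illustrative. A palindromic end equals its own reverse complement and can therefore pair with a second copy of itself, whereas a non-palindromic end can only pair with its designed partner.

```python
# Sketch only: the example overhangs are hypothetical, not taken from Figure 5.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_palindromic(overhang: str) -> bool:
    """A palindromic cohesive end equals its own reverse complement."""
    return overhang.upper() == reverse_complement(overhang)


def can_anneal(end_a: str, end_b: str) -> bool:
    """Two single-stranded ends, each written 5'->3' on its own strand,
    can pair if one is the reverse complement of the other."""
    return end_a.upper() == reverse_complement(end_b)


if __name__ == "__main__":
    for end in ("AATT", "ACTG"):
        verdict = "can self-ligate" if is_palindromic(end) else "cannot self-ligate"
        print(end, verdict)
```

With ends of the second kind, two vector molecules or two insert molecules cannot join end to end in the same orientation, which is the property the legend relies on for oriented insertion.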

Recombination can be achieved within the hypervariable region 
of the gene encoding the protein or peptide presented on the 
phagemid according to the scheme illustrated in Figure 1. With 
extension libraries, either the left (5') or right (3') 
extension, or both, can be reassorted by cleaving with a type 
IIS restriction enzyme recognizing a site bordering either the 
left end, the right end (opposite orientation), or both ends 
respectively, as described for the sequences B1...Bn and 
Qn+a+b+1...Qn+a+b+j in claim 3. 

The use of hypervariable sequences in the description of the 
invention implies in general that we try to use sets of 
oligonucleotides in which "randomized sequences" encode amino 
acids at ratios near to those normally found in natural 
proteins, while the frequency of stop codons is reduced. We 
are aware that for certain applications biased subsets may be 
preferable in the construction of dedicated sublibraries. 
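
The effect of a restricted randomization scheme on stop-codon frequency can be checked by enumeration. The sketch below is illustrative only: it assumes the standard genetic code (stop codons TAA, TAG and TGA) and compares fully random NNN codons with the NNB and NNK schemes named in claim 18; the function names are not from the source.

```python
from itertools import product

# IUPAC symbols used here: N = A/C/G/T, B = C/G/T, K = G/T.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "ACGT", "B": "CGT", "K": "GT"}
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code assumed


def expand(scheme: str) -> list:
    """List every concrete codon encoded by a three-letter degenerate scheme."""
    return ["".join(bases) for bases in product(*(IUPAC[s] for s in scheme))]


def stop_fraction(scheme: str) -> float:
    """Fraction of the codons in a scheme that are stop codons."""
    codons = expand(scheme)
    return sum(codon in STOP_CODONS for codon in codons) / len(codons)


if __name__ == "__main__":
    for scheme in ("NNN", "NNB", "NNK"):
        print(scheme, len(expand(scheme)), round(stop_fraction(scheme), 4))
```

Excluding A from the third codon position removes TAA and TGA, so only TAG remains among the 48 NNB codons or the 32 NNK codons, compared with three stop codons among the 64 NNN codons.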
Example 1 

Cosmixplexing using the four-tube method (according to claims 1-10) 

1a) Library generation 

Oligonucleotide sequences: 

NONA-CA: 
5' TCGG GGTACC TGGAGCA(XNN)4KKN(XNN)4GCTGCACGG GAGCTC GCC 3' 
        KpnI                                   SacI 

NONA-CT: 
5' TCGGGGTACCTGGAGCA(XNN)4RRN(XNN)4GCTGCACGGGAGCTCGCC 3' 

NONA-GA: 
5' TCGGGGTACCTGGAGCA(XNN)4YYN(XNN)4GCTGCACGGGAGCTCGCC 3' 

NONA-GT: 
5' TCGGGGTACCTGGAGCA(XNN)4MMN(XNN)4GCTGCACGGGAGCTCGCC 3' 

where X means A, C or G; N: A, C, G or T; K: G or T; R: G or A; Y: C or T; M: C or A. 

NONAPCR-L: 
5' GGCGAGCTCCCGTGCAGC 3' 

NONAPCR-R: 
5' TCGGGGTACCTGGAGCA 3' 

KpnI (GGTACC) and SacI (GAGCTC) restriction enzyme recognition sites are marked in bold type. 
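
As a consistency check on the sequences listed above, the sketch below confirms that NONAPCR-L is the reverse complement of the constant 3' end of the NONA oligos, that NONAPCR-R is identical to their constant 5' end, and counts the DNA sequences encoded by one randomized core. The constants are transcribed from the listing; the helper names and the degeneracy table (derived from the symbol definitions given with the oligos) are illustrative assumptions.

```python
from math import prod

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Constant flanks of the NONA oligos and the two primers, as transcribed above.
FIVE_PRIME_FLANK = "TCGGGGTACCTGGAGCA"
THREE_PRIME_FLANK = "GCTGCACGGGAGCTCGCC"
NONAPCR_L = "GGCGAGCTCCCGTGCAGC"
NONAPCR_R = "TCGGGGTACCTGGAGCA"

# Number of bases allowed by each symbol, following the definitions in the text.
DEGENERACY = {"X": 3, "N": 4, "K": 2, "R": 2, "Y": 2, "M": 2}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_variants(degenerate_core: str) -> int:
    """Number of distinct DNA sequences encoded by a degenerate core."""
    return prod(DEGENERACY[symbol] for symbol in degenerate_core)


if __name__ == "__main__":
    print(reverse_complement(THREE_PRIME_FLANK) == NONAPCR_L)
    print(FIVE_PRIME_FLANK == NONAPCR_R)
    core = "XNN" * 4 + "KKN" + "XNN" * 4  # randomized core of NONA-CA
    print(count_variants(core))
```

The count refers to DNA sequences, not to distinct peptides, since several codons encode the same amino acid.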

Important vector DNA sequences: 

pROCOS4/7: 
5' GGCGAGCTCCCGTGCAGCG CTCCAG GTACCCCGATATCAGAGCTGAA 3' 
(annotated sites: SacI, BsgI, Eco47III, BpmI, KpnI; the final GAA is the start of pIII) 

pROCOS4/7-Stuffer1: 
5' GGCGAGCTCCCGTGCAGCGCT...AGC GCTCCAG GTACCCCGATATCAGAGCTGAA 3' 
(annotated sites: SacI, BsgI, Eco47III, Eco47III, BpmI, KpnI; the dots stand for the 
952 bp Eco47III fragment of plasmid pBR322; the final GAA is the start of pIII) 

KpnI (GGTACC), SacI (GAGCTC), BsgI (GTGCAG), Eco47III (AGCGCT) and BpmI (CTGGAG) restriction 
enzyme recognition sites are marked in bold type. The first codon of the mature pIII protein (GAA) is 
indicated. 



For the generation of double-stranded DNA inserts the single-stranded hypervariable DNA 
oligos NONA-CA, NONA-CT, NONA-GA and NONA-GT are amplified using the 
single-stranded DNA oligos NONAPCR-L and NONAPCR-R as PCR primers according to the 
following protocol: 

Remark: the four hypervariable DNA oligos have to be kept strictly separated! 



PCR Amplification of DNA Oligos 

PCR buffer (10X): 
KCl                  500 mM 
Tris-HCl (pH 9.0)    100 mM 
Triton X-100         1 % 

Taq DNA polymerase (Promega) in storage buffer A: 
glycerol             50 % 
Tris-HCl (pH 8.0)    50 mM 
NaCl                 100 mM 
EDTA                 0.1 mM 
DTT                  1 mM 
Triton X-100         1 % 

TE buffer (1X): 
Tris-HCl (pH 8.0)    10 mM 
EDTA                 0.1 mM 

1. Transfer 2 µl of a 10 pmol/µl solution of each of the hypervariable oligos NONA-CA, -CT, -GA and -GT 
into a separate 0.2 ml PCR reaction tube (4 tubes). 

2. Mix the following in one Eppendorf reaction tube: 

ddH2O                           276.75 µl 
PCR buffer (10X)                45.0 µl 
NONA PCR-L (100 pmol/µl)        9.0 µl 
NONA PCR-R (100 pmol/µl)        9.0 µl 
dNTPs (10 mM each)              9.0 µl 
Taq DNA polymerase (5 U/µl)     2.25 µl 

3. Transfer 78 µl of this mixture to each of the PCR tubes containing the hypervariable oligos (step 1). 

4. Mix 45 µl MgCl2 (25 mM) and 45 µl ddH2O in an Eppendorf reaction tube. 

5. Preheat a PCR thermocycler to 94 °C (if possible use a heated lid). 

6. Transfer 20 µl of the MgCl2 solution (step 4) into each of the PCR tubes (step 3). 

7. Put the tubes directly into the thermocycler (simplified hot start) and run the following program: 

   1. 94 °C   30 sec 
   2. 94 °C   10 sec 
   3. 52 °C   10 sec 
   4. repeat steps 2 and 3 nine times 
   5. hold at 4 °C 

8. Take an aliquot of 5 µl to run a 4.5 % agarose gel. 

9. Add 200 µl ddH2O to each tube, extract with phenol, precipitate with ethanol and resuspend the DNA in 
120 µl TE buffer. 
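
The pipetting scheme in steps 1 to 6 can be checked arithmetically. The following sketch is an illustration, not part of the protocol: it takes the volumes and stock concentrations from the steps above and derives the final composition of each 100 µl reaction (78 µl master mix, 2 µl oligo, 20 µl diluted MgCl2). The variable and function names are assumptions introduced here.

```python
# Volumes (µl) of the master mix from step 2.
MASTER_MIX_UL = {
    "ddH2O": 276.75,
    "PCR buffer (10X)": 45.0,
    "NONA PCR-L (100 pmol/µl)": 9.0,
    "NONA PCR-R (100 pmol/µl)": 9.0,
    "dNTPs (10 mM each)": 9.0,
    "Taq DNA polymerase (5 U/µl)": 2.25,
}
MASTER_TOTAL_UL = sum(MASTER_MIX_UL.values())

ALIQUOT_UL = 78.0    # master mix per tube (step 3)
TEMPLATE_UL = 2.0    # hypervariable oligo, 10 pmol/µl (step 1)
MGCL2_UL = 20.0      # diluted MgCl2 per tube (step 6)
REACTION_UL = ALIQUOT_UL + TEMPLATE_UL + MGCL2_UL


def final_composition() -> dict:
    """Derive the per-reaction composition from the pipetting scheme."""
    share = ALIQUOT_UL / MASTER_TOTAL_UL          # fraction of the master mix per tube
    mgcl2_diluted_mM = 45.0 * 25.0 / (45.0 + 45.0)  # step 4: 25 mM diluted 1:1
    return {
        "template_pmol": TEMPLATE_UL * 10.0,
        "primer_each_pmol": 9.0 * 100.0 * share,
        "dNTP_each_mM": 9.0 * 10.0 * share / REACTION_UL,
        "MgCl2_mM": mgcl2_diluted_mM * MGCL2_UL / REACTION_UL,
        "Taq_units": 2.25 * 5.0 * share,
        "buffer_fold": 45.0 * 10.0 * share / REACTION_UL,
    }


if __name__ == "__main__":
    print("master mix total (µl):", MASTER_TOTAL_UL)
    for name, value in final_composition().items():
        print(name, round(value, 3))
```

Under these figures the master mix suffices for 4.5 reactions, which leaves a small excess over the four tubes actually set up.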

For cloning, the amplified oligo DNAs are cut with KpnI and SacI. The vector DNA also has to 
be cut with both enzymes. As vector DNA, either pROCOS4/7 or a derivative thereof named 
pROCOS4/7-Stuffer1, which contains a DNA stuffer fragment for easier control of the double 
digest reaction, can be used without any consequences for the final cloning results. 
Digestions are done according to the following protocols: 

buffer B + TX-100 (1X): 
Tris-HCl (pH 7.5)       10 mM 
MgCl2                   10 mM 
BSA                     0.1 mg/ml 
Triton X-100            0.02 % 

buffer A (1X): 
Tris-acetate (pH 7.9)   33 mM 
Mg-acetate              10 mM 
K-acetate               66 mM 
Dithiothreitol          0.5 mM 

Vector DNA Digestion 

1. For the restriction digestion of the vector DNA with KpnI set up the following mixture: 

pROCOS4/7-Stuffer1          X µl (200 µg) 
buffer B + TX-100 (10X)     150 µl 
BSA (10 mg/ml = 100X)       15 µl 
KpnI                        X µl (400 U) 
ddH2O                       to 1500 µl 

incubate at 37 °C for 3 hr and stop the reaction by incubating at 65 °C for 20 min. 

2. Take an aliquot of 3 µl and run a 1 % agarose gel with uncleaved DNA as a control. 

3. Extract with phenol, precipitate with ethanol and resuspend the DNA in 820 µl TE buffer. 

4. Store a 20 µl aliquot of the digested DNA at -20 °C and mix the following for the digestion with SacI: 

pROCOS4/7-Stuffer1/KpnI     800 µl 
buffer A (10X)              100 µl 
SacI                        X µl (400 U) 
ddH2O                       to 1000 µl 

incubate at 37 °C for 3 hr. 

5. Take an aliquot of 3 µl and run a 1 % agarose gel using uncleaved and single-cut DNA as controls. 

6. Extract with phenol, precipitate with ethanol and resuspend the DNA in 550 µl TE buffer. 

Oligo DNA Digestion 

1. For the digestion of double-stranded (ds) oligo DNA with KpnI set up the following four mixtures: 

NONA-CA, -CT, -GA or -GT dsDNA     100 µl 
buffer B + TX-100 (10X)            50 µl 
BSA (10 mg/ml = 100X)              5 µl 
KpnI                               X µl (400 U) 
ddH2O                              to 500 µl 

incubate at 37 °C for 5 hr. 

NOTE: Do not heat the oligo DNA. 

2. Take an aliquot of 5 µl and run a 4.5 % agarose gel with uncleaved DNA as a control. 

3. Extract with phenol, precipitate with ethanol and resuspend the DNA in 110 µl TE buffer. 

4. Store a 10 µl aliquot of the digested DNA at -20 °C and set up the following four mixtures for the digestion 
with SacI: 

NONA-CA, -CT, -GA or -GT/KpnI      100 µl 
buffer A (10X)                     50 µl 
SacI                               X µl (400 U) 
ddH2O                              to 500 µl 

incubate at 37 °C for 5 hr. 

5. Take an aliquot of 5 µl and run a 4.5 % agarose gel using uncleaved and single-cut DNA as controls. 

6. Extract with phenol, precipitate with ethanol and resuspend the DNA in 55 µl TE buffer. 



The vector DNA fragment may be purified using the following protocol: 

Purification of Vector DNA Fragments by Gel Extraction 

1. To separate the pROCOS4/7 vector DNA fragment from the stuffer fragment prepare a horizontal 1 % 
agarose gel using a one-tooth comb. 

2. Mix the DNA with 1/10 vol gel loading buffer, load onto the gel and electrophorese at 100 V until both 
fragments are clearly separated. 

3. Put the gel on the UV transilluminator and excise the 5.5 kb pROCOS4/7 vector DNA fragment. 

4. Extract the agarose slice using the JETsorb gel extraction kit (Genomed GmbH, Germany). 

Vector and insert DNA fragments are ligated and transformed according to the following 
protocols: 

Ligation of DNA Fragments 

Check the integrity of vector and insert DNA fragments by agarose gel electrophoresis (1 % and 4.5 % 
respectively). The concentration of the insert DNA may be estimated by comparing its ethidium bromide 
staining with standards of known quantity, such as assembled oligonucleotides. To determine the vector DNA 
concentration measure the absorbance at 260/280 nm. 

T4 DNA ligase buffer (1X): 
Tris-HCl (pH 7.5)    50 mM 
MgCl2                10 mM 
Dithiothreitol       10 mM 
ATP                  1 mM 
BSA                  25 µg/ml 

Test Ligation 

1. To determine the appropriate ratio of insert to vector DNA a series of test ligations may be performed. For 
this, assemble ligation reactions composed of: 

vector DNA fragment             X µl (0.5 µg) 
T4 DNA ligase buffer (10X)      1 µl 
ddH2O                           to 9 µl 

2. Prepare three twofold dilutions of the insert DNAs in ddH2O and add 1 µl of undiluted DNA as well as 1 µl 
of each dilution to one of the ligation reactions. 

NOTE: The aim of this is to create vector to insert DNA (V/I) ratios of 1:5 to 2:1. 

3. Add 1 unit T4 DNA ligase to each reaction and incubate overnight at 15 °C. 

NOTE: As a control one reaction without insert DNA and one without ligase should be included. 

4. Add 1 vol ddH2O to each reaction and incubate at 65 °C for 10 min. 

5. Precipitate the DNA with ethanol and resuspend it in 10 µl TE buffer. 

6. Transform electrocompetent E. coli JM110λ cells with the content of each tube and plate dilutions on 
ampicillin-containing LB agar plates. 

Large-Scale Ligation 

1. To create the libraries set up four of the following mixtures: 

vector DNA fragment             X µl (1/4 of the total prep.) 
insert DNA                      X µl (to create the optimal V/I ratio) 
T4 DNA ligase buffer (10X)      X µl (1/10 of the final vol) 
T4 DNA ligase                   X µl (2 U/µg DNA) 
ddH2O                           to create a DNA conc. of 0.05 µg/µl 

incubate overnight at 15 °C. 

2. Extract with phenol, precipitate with ethanol and resuspend each of the ligation mixtures in sufficient TE 
buffer to adjust the DNA concentration to 0.1-0.2 µg/µl. 

Preparation of Competent Cells 

1. Inoculate 20 ml of LB medium with a single colony of E. coli JM110λ and incubate at 37 °C and 180 rpm 
overnight. 

2. Next day inoculate 2 x 1 liter of LB medium (2 x 2 l Erlenmeyer flasks) at 1 % with the overnight 
culture and incubate again under the same conditions until the target OD600 [value illegible in the source] has been reached. 

3. Transfer 250 ml aliquots of the culture into centrifuge tubes (GS3), chill the cells on ice and centrifuge for 
15 min at 8000 rpm and 4 °C (Sorvall RC5C centrifuge; GS3 rotor). Decant the supernatant. 

4. Resuspend each pellet in 250 ml of ice-cold ddH2O, centrifuge again (step 3) and decant the supernatant. 

5. Resuspend each pellet in 125 ml of ice-cold ddH2O, combine two aliquots per tube, centrifuge again 
(step 3) and decant the supernatant. 

6. Resuspend each pellet in 10 ml of ice-cold sterile glycerol (10 %), collect all of the aliquots in one GSA 
centrifuge tube, centrifuge for 15 min at 8000 rpm and aspirate the supernatant. 

7. Resuspend the bacterial pellet in 10 ml of ice-cold sterile glycerol (10 %). 

8. Dispense aliquots of 100 µl into precooled, sterile Eppendorf reaction tubes, freeze immediately in liquid nitrogen 
and store at -70 °C. 

Transformation of E. coli Cells by Electroporation 

1. Place frozen aliquots of competent E. coli cells on ice and let them thaw. 

2. To each aliquot add up to 2 µg DNA in less than 10 µl and incubate on ice for 1 minute. 

3. Transfer the suspension into a prechilled electroporation cuvette (0.2 cm path length), place the cuvette in the 
electroporation sled and give a pulse at a voltage of 2.5 kV, a capacitance of 25 µF and a resistance of 200 Ω (Gene 
Pulser and Pulse Controller, Bio-Rad). 

4. Immediately add 1 ml of LB medium (supplemented with 20 mM glucose), mix and transfer the suspension 
into an Eppendorf reaction tube. 

5. Incubate for 1 hour at 37 °C and plate on LB agar plates containing ampicillin (100 µg/ml). Incubate 
overnight at 37 °C. 

NOTE: To determine the size of the libraries also plate dilutions of the transformed cells. 

6. To create library stocks resuspend the cells in LB/ampicillin medium, mix with 1 vol of sterile 87 % glycerol 
and store at -70 °C. 
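
The pulse settings in step 3 of the electroporation protocol imply a nominal field strength and time constant that can be derived directly. The sketch below is illustrative only: it assumes the simple relations E = V/d and τ = R·C, and that the stated 0.2 cm path length is the electrode gap. The time constant actually observed also depends on the sample resistance.

```python
def field_strength_kv_per_cm(voltage_kv: float, gap_cm: float) -> float:
    """Nominal field strength across the cuvette gap (E = V / d)."""
    return voltage_kv / gap_cm


def time_constant_ms(resistance_ohm: float, capacitance_uf: float) -> float:
    """Nominal RC time constant in milliseconds (tau = R * C)."""
    return resistance_ohm * capacitance_uf * 1e-6 * 1e3


if __name__ == "__main__":
    # Settings from step 3: 2.5 kV, 25 µF, 200 Ω, 0.2 cm cuvette.
    print(field_strength_kv_per_cm(2.5, 0.2), "kV/cm")
    print(time_constant_ms(200.0, 25.0), "ms")
```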



1b) Recombination 

For recombination within the hypervariable sequences according to the four-tube 
cosmixplexing method the libraries can be preselected. For this purpose the E. coli cells 
containing the phagemid libraries are superinfected with M13K07 helper phages; progeny 
phages presenting fusion proteins are harvested and used for the first round of panning 
according to standard methods, e.g.: 

Preparation of M13K07 Phage Stocks 

PEG/NaCl solution (16.7 % / 3.3 M): 
100 g PEG 8000 
116.9 g NaCl 
475 ml H2O 

PBS buffer (1X): 
8.0 g NaCl 
0.2 g KCl 
1.43 g Na2HPO4 + 2 H2O 
0.2 g KH2PO4 
H2O ad 1 l 
pH 6.8 - 7 
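
The stated strength of the PEG/NaCl solution (16.7 % / 3.3 M) can be cross-checked against the weighed amounts. The sketch below assumes that the percentage is weight per volume and uses a molar mass of 58.44 g/mol for NaCl; both are assumptions made here, not statements from the recipe. It computes the final volume implied by each of the two stated concentrations.

```python
NACL_G_PER_MOL = 58.44  # molar mass of NaCl


def volume_from_molarity_l(mass_g: float, molarity: float) -> float:
    """Final volume (litres) implied by a NaCl mass and a target molarity."""
    return mass_g / NACL_G_PER_MOL / molarity


def volume_from_percent_l(mass_g: float, percent_w_v: float) -> float:
    """Final volume (litres) implied by a solute mass and a % (w/v) value,
    where 1 % (w/v) means 1 g per 100 ml."""
    return mass_g / percent_w_v / 10.0


if __name__ == "__main__":
    print(round(volume_from_molarity_l(116.9, 3.3), 3), "l from NaCl")
    print(round(volume_from_percent_l(100.0, 16.7), 3), "l from PEG")
```

Both figures point to a final volume of roughly 0.6 l, which is plausible for 475 ml of water plus the dissolved solids.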

1. Use a disposable Pasteur pipette to pick a single, well-separated M13K07 plaque from an E. coli WK6 lawn 
grown overnight on an LB/kanamycin (Km) plate, inoculate 20 ml of LB(2X)/Km medium (100 ml Erlenmeyer 
flask) with this agar plug and incubate during the day at 37 °C on a shaker at 180 rpm. 

2. Inoculate 2 x 500 ml LB(2X)/Km medium (in 2 l Erlenmeyer flasks) with 10 ml preculture and incubate 
overnight (37 °C, 180 rpm). 

3. Next day centrifuge four 250 ml aliquots for 15 minutes at 8,000 rpm and 4 °C (Sorvall RC5C centrifuge; 
GS3 rotor). Transfer the supernatant into centrifuge bottles, centrifuge and transfer the supernatant again into 
fresh centrifuge bottles. 

4. Add 0.15 vol of PEG/NaCl solution, mix and incubate on ice for at least 2 hours. 

5. Centrifuge for 60 min at 8000 rpm (GS3 rotor), decant the supernatant, centrifuge for a few seconds at up to 4000 
rpm and remove the last traces of the supernatant using a pipette. 

6. Resuspend each PEG pellet in 2.5 ml PBS solution and collect the resuspended phages in one SS34 
centrifuge bottle. To clear the suspension centrifuge again for 10 min at 12000 rpm (SS34 rotor). Recover the 
supernatant (pipette), add NaN3 to a final concentration of 0.02 % and store the phages at 4 °C. 

Packaging of Phagemids (keep each library separate!) 

1. Inoculate 100 ml of LB/Amp medium (1 l Erlenmeyer flask) with 1 ml of E. coli JM110λ cells containing 
phagemids (from an overnight culture or resuspended cells) and incubate at 37 °C and 180 rpm until the target 
OD600 [value illegible in the source] is reached (2.5 h). 

2. Add 500 µl M13K07 stock solution (titer in cfu/ml; exponents illegible in the source), incubate at 37 °C for 15 min 
and continue shaking at 37 °C and 180 rpm overnight. 

3. Next day centrifuge for 10 min at 8000 rpm (GSA rotor), decant the supernatant into a fresh bottle and 
repeat the centrifugation step. 

4. Add 0.15 vol of PEG/NaCl solution and incubate on ice for at least two hours. 

5. Centrifuge for 60 min at 10000 rpm (GSA rotor), decant the supernatant, repeat the centrifugation and 
remove the supernatant completely. 

6. Dissolve the pellet in 1 ml of PBS buffer and transfer the solution into an Eppendorf reaction tube. 
Centrifuge for 10 min at 13000 rpm (bench centrifuge), recover the cleared solution and add NaN3 (final 
concentration of 0.02 %). Store at 4 °C. 

Panning Procedure (keep each library separate!) 

T-PBS solution: 
PBS buffer containing 0.5 % Tween 20 

Blocking solution: 
PBS buffer containing 2 % skim milk powder 

Elution buffer: 
glycine (0.1 M; pH 2.2) 

1. Coating of Microtiter Plates: 

Fill 100 µl of ligand solution (100 µg/ml PBS) into the wells of a 96-well microtiter plate (Nunc maxisorb) and 
incubate overnight at 4 °C or for at least 2 hours at room temperature. 

Shake out the wells, slap the plate onto a paper towel and wash the wells once with T-PBS solution (ELISA 
plate washer or manually). 

2. Blocking: 

Fill the wells with 400 µl of blocking solution and incubate at room temperature for 1 hour. Shake out the 
wells, slap the plate onto a paper towel and wash the wells once with T-PBS. 

3. Binding: 

Fill the coated wells and one uncoated well (as a control) with 100 µl of phage preparation diluted 1:1 with skim 
milk powder (number of phages per well: exponents illegible in the source) and incubate at room temperature for 1 to 3 hours. 

4. Washing: 

Remove the solutions using a pipette and slap the plate onto a paper towel. 

In the first round of panning wash the wells once with T-PBS, incubate for 10 min with 400 µl blocking 
solution, wash again with T-PBS and finally two times with water. During all further rounds repeat the T-PBS 
washing steps three times. All washing steps can be carried out manually using a pipette or with an ELISA 
plate washer. 

5. Elution: 

Slap out the plate and fill the wells with 100 µl of elution buffer, incubate at room temperature for 15 min and 
transfer the solution into an Eppendorf reaction tube containing 6 µl Tris (2 M). 

6. Determine the titer of eluted phages as described under 3.1.3. 

Reinfection of E. coli Cells (keep each library separate!) 

1. Mix the eluted phages and 10 ml of E. coli JM110λ log-phase cells and incubate for 30 min at 37 °C. 

2. Collect the cells by centrifugation (5 min, 8000 rpm, SS34 rotor) and resuspend the pellet in 400 µl of 
LB/Amp medium. 

3. Plate each suspension on one LB/Amp agar plate (Ø 14.5 cm) and incubate overnight at 37 °C. 

After one round of panning, populations of about 10^6 individual clones enriched for 
binders are expected. For recombination the phagemid DNA has to be isolated according 
to standard protocols, e.g.: 

Preparation of Phagemid-DNA from Reinfected Cells 

1. Resuspend reinfected E. coli cells in 20 ml of LB/Amp medium and use 200 µl for the inoculation of 3 ml 
LB/Amp medium. 

2. Incubate at 180 rpm and 37 °C for 1 hour. 

3. Prepare the DNA using Jetquick Plasmid Miniprep Spin Kits (Genomed GmbH, Germany) according to the 
instructions of the supplier. 

Using this method up to 30 µg of DNA can be isolated. For pROCOS4/7-based libraries the 
phagemid size is 4.3 kb, corresponding to a molecular weight of 2.9 x 10^6 g/mol or roughly 
2 x 10^11 phagemid molecules/µg DNA. Therefore 10 µg of recombined DNA contains more 
molecules than the theoretical number of different variants that can be created from 10^6 clones 
((10^6)^2 = 10^12). 
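
The molecule count in the preceding paragraph follows from the phagemid length. The sketch below is an illustration under two assumptions that are not stated in the source: an average mass of 660 g/mol per base pair of double-stranded DNA, and a preselected population of 10^6 clones (the exponents are poorly legible in the source). It computes the number of phagemid molecules per µg and compares 10 µg of DNA with the number of pairwise recombinants.

```python
AVOGADRO = 6.022e23          # molecules per mol
BP_G_PER_MOL = 660.0         # assumed average mass of one dsDNA base pair


def molar_mass_g_per_mol(length_bp: int) -> float:
    """Approximate molar mass of a double-stranded DNA molecule."""
    return length_bp * BP_G_PER_MOL


def molecules_per_ug(length_bp: int) -> float:
    """Number of DNA molecules contained in 1 µg of a dsDNA of given length."""
    return AVOGADRO * 1e-6 / molar_mass_g_per_mol(length_bp)


if __name__ == "__main__":
    phagemid_bp = 4300
    clones = 10 ** 6             # assumed size of the preselected population
    in_10_ug = 10 * molecules_per_ug(phagemid_bp)
    print(f"molar mass: {molar_mass_g_per_mol(phagemid_bp):.2e} g/mol")
    print(f"molecules in 10 µg: {in_10_ug:.2e}")
    print("covers all pairwise recombinants:", in_10_ug > clones ** 2)
```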

For recombination the phagemid DNA of each preselected library is cut separately, e.g. with 
BpmI or alternatively with BsgI: 

Digestion of the Phagemid DNA 

NEB3 buffer (1X): 
NaCl                 100 mM 
Tris-HCl (pH 7.9)    50 mM 
MgCl2                10 mM 
Dithiothreitol       1 mM 

1. Set up the following reaction: 

phagemid DNA         10 µg 
BpmI (2 U/µl)        5 µl 
NEB3 (10X)           4 µl 
BSA (1 mg/ml)        4 µl 
H2O                  up to 40 µl 

incubate at 37 °C for 5 hr. 

2. Take an aliquot of 4 µl and run a 1 % agarose gel to check the digestion. 

3. Extract with phenol, precipitate with ethanol and resuspend the DNA in TE buffer. 

Digested phagemid DNAs are religated at high concentration (> 0.2 µg/µl) to favour 
formation of concatemers, packaged into λ phage particles and used for the transfection of E. 
coli cells (according to "Packaging of Bacteriophage λ DNA in vitro; protocol I", pp. 2.100- 
2.104, in: Molecular Cloning - A Laboratory Manual, Sambrook et al. (eds.), 2nd ed., 1989, 
Cold Spring Harbor Laboratory Press). Transfected phagemids are separated by packaging 
and reinfection using M13K07 helper phages (see above). 



WO 98/33901 PCT/EP98/00533 


Example 2 

Cosmixplexing using the one-tube method (according to claims 11-17 and 44) 

2a) Library generation 

Oligonucleotide sequences: 

NONACOS-NGG: 
5' GGCTCTGATGGAAGACGT↓GCAGC(NNB)4NGG(NNB)4TGC↓TCCAGAGTCTTC 3' 
(annotated sites, 5' to 3': BpiI, BsgI, BpmI, BpiI) 

NONACOS-NCT: 
5' GGCTCTGATGGAAGACGTGCAGC(NNB)4NCT(NNB)4TGCTCCAGAGTCTTC 3' 

NONACOS-NAM: 
5' GGCTCTGATGGAAGACGTGCAGC(NNB)4NAM(NNB)4TGCTCCAGAGTCTTC 3' 

NONACOS-NTS: 
5' GGCTCTGATGGAAGACGTGCAGC(NNB)4NTS(NNB)4TGCTCCAGAGTCTTC 3' 

where N means A, C, G or T; B: C, G or T; M: A or C; and S: C or G. 

NONACOS-PCR-L (BpiI, BsgI): 
5' GGCTCTGATGGAAGACGTGCAG 3' 

NONACOS-PCR-R (BpiI, BpmI): 
5' CGACAGGAGGAAGACTCTGGAG 3' 

BpiI (GAAGAC), BsgI (GTGCAG) and BpmI (CTCCAG) restriction enzyme recognition sites are marked in 
bold type. BpiI cutting sites are marked by arrows. 
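
The layout of the NONACOS oligos can be related to the general formula used in the claims. The sketch below is one possible reading and is labeled as such: it assumes n = j = 6, a = 14 and b = 16 (claim 15), takes B as the BsgI site and Q as the inverted BpmI site (claim 17(b)), and checks that the resulting segments reproduce the region between the two BpiI sites of NONACOS-NGG. The segment boundaries and all names in the code are assumptions made for illustration.

```python
# Parameters of the claim formula as given in claim 15.
N_LEN, A_LEN, B_LEN, J_LEN = 6, 14, 16, 6

# Assumed assignment of the NONACOS-NGG core to the formula segments.
SEGMENTS = {
    "B": "GTGCAG",                     # B1...Bn: BsgI recognition site
    "X_left": "C" + "NNB" * 4 + "N",   # Xn+1...Xn+a
    "Z": "GG",                         # Zn+a+1 Zn+a+2
    "X_right": "NNB" * 4 + "TG",       # Xn+a+3...Xn+a+b
    "Q": "CTCCAG",                     # Qn+a+b+1...Qn+a+b+j: inverted BpmI site
}

# Core of NONACOS-NGG with the repeat notation written out.
NGG_CORE = "GTGCAGC" + "NNB" * 4 + "NGG" + "NNB" * 4 + "TGCTCCAG"


def layout_is_consistent() -> bool:
    """Check segment lengths against n, a, b, j and against the oligo core."""
    lengths_ok = (
        len(SEGMENTS["B"]) == N_LEN
        and len(SEGMENTS["X_left"]) == A_LEN
        and len(SEGMENTS["Z"]) + len(SEGMENTS["X_right"]) == B_LEN
        and len(SEGMENTS["Q"]) == J_LEN
    )
    return lengths_ok and "".join(SEGMENTS.values()) == NGG_CORE


if __name__ == "__main__":
    print(layout_is_consistent(), len(NGG_CORE))
```

On this reading the formula spans n + a + b + j = 42 positions, with the two Z bases (here GG) in the middle of the hypervariable stretch.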

Important vector DNA sequences of pROCOS5/3: 

5' GGCGAGCTCCCGT↓GCAGCGGTCTTCAGCGCTTGCCGTCTGACCGT 
(annotated sites: BsgI, BpiI, Eco47III) 

   AGCGCTGGAAGACGC↓TCCAGAGGGTACCCCGATATCAGAGCTGAA 3' 
(annotated sites: Eco47III, BpiI, BpmI; the final GAA is the start of pIII) 

BpiI (GAAGAC), BsgI (GTGCAG), Eco47III (AGCGCT) and BpmI (CTCCAG) restriction enzyme recognition 
sites are marked in bold type. BpiI cutting sites are marked by arrows. The first codon of the mature pIII 
protein (GAA) is also indicated. 
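
The compatibility of the BpiI-generated ends of insert and vector can be checked with a short sketch. It assumes the cleavage pattern commonly documented for BpiI, GAAGAC(N)2/(N)6, which leaves a 4-base 5' overhang, and it uses the flanking sequences as transcribed above; both are assumptions, and the function name is illustrative. Overhangs are reported as read on the top strand, so an insert end and a vector end are compatible when the reported strings are equal.

```python
SITE_FORWARD = "GAAGAC"   # BpiI site, cutting downstream (to the right)
SITE_REVERSE = "GTCTTC"   # the same site on the opposite strand, cutting upstream


def bpii_overhangs(top_strand: str) -> list:
    """Return the 4-base overhangs (top-strand reading) for every BpiI site.

    Assumes cleavage 2 bases (same strand) and 6 bases (opposite strand)
    away from the recognition site.
    """
    seq = top_strand.upper()
    found = []
    for i in range(len(seq) - 5):
        hexamer = seq[i:i + 6]
        if hexamer == SITE_FORWARD and i + 12 <= len(seq):
            found.append(seq[i + 8:i + 12])
        elif hexamer == SITE_REVERSE and i - 6 >= 0:
            found.append(seq[i - 6:i - 2])
    return found


# Flanks of the NONACOS oligos and of the pROCOS5/3 vector, as transcribed above.
INSERT_LEFT = "GGCTCTGATGGAAGACGTGCAGC"
INSERT_RIGHT = "TGCTCCAGAGTCTTC"
VECTOR_LEFT = "GGCGAGCTCCCGTGCAGCGGTCTTCAGCGCT"
VECTOR_RIGHT = "AGCGCTGGAAGACGCTCCAGAGG"

if __name__ == "__main__":
    for name, seq in [("insert left", INSERT_LEFT), ("vector left", VECTOR_LEFT),
                      ("insert right", INSERT_RIGHT), ("vector right", VECTOR_RIGHT)]:
        print(name, bpii_overhangs(seq))
```

Under these assumptions the left junction carries GCAG and the right junction TCCA on both insert and vector. Neither overhang is palindromic and the two differ from each other, which is consistent with the oriented insertion described for driven directed cloning.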

To create libraries according to the one-tube method the hypervariable oligos NONACOS- 
NGG, -NCT, -NAM and -NTS are amplified using the PCR primers NONACOS-PCR-R and 
NONACOS-PCR-L as described in Example 1, except that the oligo DNAs do not have to be kept 
separate. 

After this, pROCOS5/3 vector DNA and double-stranded (ds) oligo DNA are digested with 
BpiI and ligated at the same time according to the following protocol: 

Digestion/Ligation 

1. Set up the following mixture: 

pROCOS5/3 DNA                                   200 µg 
NONACOS-NGG, -NCT, -NAM and -NTS dsDNA          100 µl 
BpiI                                            200 U 
buffer G (10X)                                  40 µl 
BSA (10 mg/ml)                                  4 µl 
H2O                                             up to 400 µl 

incubate at 37 °C for 2 hr, add 200 units T4 DNA ligase and continue the incubation at 15 to 30 °C overnight. 

2. Take an aliquot of 3 µl and run a 1 % agarose gel as a control. 

This protocol favours the production of concatemers of the desired product, which can be 
packaged, for example into E. coli JM110λ cells, by λ packaging according to Example 1. 

2b) Recombination 

For panning and recombination the same methods as described for Example 1 can be used, 
except that one library is used instead of four separate libraries. 

1. A bank of genes, wherein said genes comprise a double 
stranded DNA sequence which is represented by the 
following formula of one of their strands: 

5'-B1B2B3...BnXn+1...Xn+aZn+a+1Zn+a+2Xn+a+3...Xn+a+bQn+a+b+1...Qn+a+b+j-3' 

wherein n, a, b and j are integers and 
n>3, a>1, b>3 and j>1, 

wherein Xn+1...Xn+a+b is a hypervariable sequence and B, 
X, Z and Q represent adenine (A), cytosine (C), guanine 
(G) or thymine (T), 

(i) Z represents G or T at a G:T ratio of about 1:1, 
and/or 

(ii) Z represents C or T at a C:T ratio of about 1:1, 
and/or 

(iii) Z represents A or G at an A:G ratio of about 1:1, 
and/or 

(iv) Z represents A or C at an A:C ratio of about 1:1, and 
wherein 

subsequences B1...Bn and/or Qn+a+b+1...Qn+a+b+j represent 
recognition sites for restriction enzymes, and wherein the 
recognition sites are orientated such that their cleavage 
site upon cleavage generates a cohesive end including the 
two bases designated Z. 

2. A bank of genes according to claim 1, wherein subsequences 
B1...Bn or Qn+a+b+1...Qn+a+b+j represent recognition sites 
for restriction enzymes and wherein the recognition sites 
are orientated such that their cleavage site upon cleavage 
generates a cohesive end including the two bases 
designated Z. 

3. A bank of genes according to claim 1 or 2, wherein the 
cohesive end is a 2 bp single strand end formed by the two 
bases designated Z. 

4. A bank of genes according to any of claims 1 to 3, wherein 
each gene is provided as display vector, especially as M13 
phage or M13 like phage or as phagemid. 

5. A set of four gene banks according to any of the preceding 
claims, wherein the gene banks are characterized as 
follows: 

- first gene bank: Z represents G or T, preferentially at 
a G:T ratio of about 1:1; 

- second gene bank: Z represents C or T, preferentially at 
a C:T ratio of about 1:1; 

- third gene bank: Z represents A or G, preferentially at 
an A:G ratio of about 1:1; and 

- fourth gene bank: Z represents A or C, preferentially at 
an A:C ratio of about 1:1. 

6. A set of four gene banks according to claim 5, wherein each 
gene is provided as display vector, especially as M13 
phage or M13 like phage or as phagemid. 

7. A bank of genes, wherein said genes comprise a double 
stranded DNA sequence which is represented by the 
following formula of one of their strands: 

5'-B1B2B3...BnXn+1...Xn+aZn+a+1Zn+a+2Xn+a+3...Xn+a+bQn+a+b+1...Qn+a+b+j-3' 

wherein n, a, b and j are integers and 
n>3, a>1, b>3 and j>1, 

wherein Xn+1...Xn+a+b is a hypervariable sequence and B, 
X, Z and Q represent adenine (A), cytosine (C), guanine 
(G) or thymine (T), and wherein 

four sets of oligonucleotide sequences comprising Zn+a+1 
and Zn+a+2 are present, preferentially at a ratio of 
(i):(ii):(iii):(iv) of about 1:1:2:2, wherein the four 
sets are characterized as follows: 

first set: Zn+a+1 represents G and Zn+a+2 also represents 
G; 

second set: Zn+a+1 represents C and Zn+a+2 represents T; 

third set: Zn+a+1 represents A and Zn+a+2 represents A or 
C, preferentially at an A:C ratio of about 1:1; and 

fourth set: Zn+a+1 represents T and Zn+a+2 represents C or 
G, preferentially at a C:G ratio of about 1:1, and wherein 

sequences B1...Bn and/or Qn+a+b+1...Qn+a+b+j represent 
recognition sites for restriction enzymes, wherein the 
recognition sites are orientated such that their cleavage 
site upon cleavage generates a cohesive end including the 
two bases designated Z. 

8. A bank of genes according to claim 7, wherein the four sets 
of oligonucleotide sequences are present at a ratio of 

(i):(ii):(iii):(iv) of (0 to 1):(0 to 1):(0 to 1):(0 to 1) 

with the proviso that at least one of said sets is 
present. 

9. A bank of genes according to claim 7 or 8, wherein 
subsequences B1...Bn and/or Qn+a+b+1...Qn+a+b+j represent 
recognition sites for restriction enzymes and wherein the 
recognition sites are orientated such that their cleavage 
site upon cleavage generates a cohesive end including the 
two bases designated Z. 

10. A bank of genes according to claim 7, 8 or 9, wherein the 
cohesive end is a 2 bp single strand end formed by the two 
bases designated Z. 

11. A bank of genes wherein said genes comprise a double 
stranded DNA sequence which is represented by the 
following formula of one of their strands: 

5'-B1B2B3...BnXn+1...Xn+aZn+a+1Zn+a+2Xn+a+3...Xn+a+bQn+a+b+1...Qn+a+b+j-3' 

wherein n, a, b and j are integers and 
n>3, a>1, b>3 and j>1, 

wherein Xn+1...Xn+a+b is a hypervariable sequence and B, 
X, Z and Q represent adenine (A), cytosine (C), guanine 
(G) or thymine (T), and wherein 

the following six sets of oligonucleotide sequences 
comprising Xn+a, Zn+a+1 and Zn+a+2 are present, preferably 
at a ratio of (i):(ii):(iii):(iv):(v):(vi) of about 
3:4:3:4:4:1, wherein the six sets are characterized as 
follows: 

first set: Xn+a represents A, G and/or T, preferentially 
at a ratio of about 1:1:1, or Xn+a represents C, G and/or 
T, preferentially at a ratio of about 1:1:1, Zn+a+1 
represents G and Zn+a+2 represents G; 

second set: Xn+a represents A, C, G and/or T, 
preferentially at a ratio of about 1:1:1:1, Zn+a+1 
represents C and Zn+a+2 represents T; 

third set: Xn+a represents A, C and/or G, preferentially 
at a ratio of about 1:1:1, Zn+a+1 represents A and Zn+a+2 
represents A; 

fourth set: Xn+a represents A, C, G and/or T, 
preferentially at a ratio of about 1:1:1:1, Zn+a+1 
represents A and Zn+a+2 represents C; 

fifth set: Xn+a represents A, C, G and/or T, 
preferentially at a ratio of about 1:1:1:1, Zn+a+1 
represents T and Zn+a+2 represents C; 

sixth set: Xn+a represents A, Zn+a+1 represents T and 
Zn+a+2 represents G. 



12. A bank of genes according to claim 11, wherein the six sets 
of oligonucleotide sequences are present at a ratio of 

(i):(ii):(iii):(iv):(v):(vi) of (0 to 1):(0 to 1):(0 to 
1):(0 to 1):(0 to 1):(0 to 1) with the proviso that at 
least one of said sets is present. 

13. A bank of genes according to any of claims 7 to 12, wherein 
each gene is provided as display vector, especially as M13 
phage or M13 like phage or as phagemid. 



14. A bank of genes according to any of the preceding claims, 
wherein the double stranded DNA sequence according to 
claim 1 or claim 7 is comprised by a DNA region encoding a 
peptide or a protein to be displayed. 

15. A bank of genes according to any of the preceding claims, 
characterized in that n = j = 6, a = 14 and b = 16. 



16. A bank of genes according to any of the preceding claims, 
wherein the restriction enzyme is a type IIS restriction 
enzyme. 

17. A bank of genes according to any of the preceding claims, 
characterized in that 

(a) subsequence B1...Bn is the recognition site for the 
restriction enzyme BpmI (CTGGAG) and subsequence 
Qn+a+b+1...Qn+a+b+j is an inverted BsgI recognition 
site (CTGCAC); or 

(b) subsequence B1...Bn is the recognition site for the 
restriction enzyme BsgI (GTGCAG) and subsequence 
Qn+a+b+1...Qn+a+b+j is an inverted BpmI recognition site 
(CTCCAG). 

18. A bank of genes according to any of the preceding claims, 
characterized in that the hypervariable sequence Xn+1... 
Xn+a+b contains NNB or NNK wherein 

N = adenine (A), cytosine (C), guanine (G) or thymine (T); 
B = cytosine (C), guanine (G) or thymine (T); and 
K = guanine (G) or thymine (T). 

19. A phagemid pROCOS4/7 of the sequence shown in Fig. 6. 

20. A phagemid pROCOS5/3 of the sequence shown in Fig. 7. 

21. A method for the production of large 

- phage-display libraries or 

- phagemid-display libraries, 

containing or consisting of optionally packaged recombined 
display vectors, wherein recombination takes place at the 
cleavage site(s) for a restriction enzyme (cut (B) enzyme, 
arrow in Fig. 3) and wherein 

(a) to (b) a double-stranded DNA prepared from Escherichia 
coli cells containing a display vector population, 
consisting of M13 phages or M13 like phages or consisting 
of phagemids according to any of the preceding claims; a 
cosmid vector; a restriction enzyme for cut (B); and a 
restriction enzyme for cut (A) are selected, wherein 

(i) the cut (B) enzyme cleaves the display vectors in the 
region encoding the displayed peptide or displayed protein 
(arrow in Fig. 3) and generates unique non-symmetrical 
cohesive ends, wherein each cohesive end is a 2 bp single 
strand end formed by the two bases designated Z, and 

(ii) the cut (A) enzyme cleaves the display vectors and the 
cosmid vector and generates upon cleavage unique 
non-symmetrical cohesive ends which differ from those 
resulting from cut (B), 

c) the display vectors are cleaved with the first 
restriction enzyme, 

d) the display vector and the cosmid vector are cleaved 
with the second restriction enzyme, 

e) the cleaved display vectors are ligated with the 
cleaved cosmid vectors forming concatamers, 

f) the ligation product is subjected to a lambda packaging 
and transduced into an Escherichia coli host, 



wo 98/33901 PCT/EP98/00533 


(g) if wanted, selection is made for a gene present in the 
ligated display vectors, 

(h) the transduced display vectors in the Escherichia coli 
host are 

- either in the case of a phage-display vector 
spontaneously packaged in M13 or M13-like phage coats 

- or in the case of a phagemid-display vector packaged by 
infecting the Escherichia coli host with an M13-type 
helper phage (superinfection), 

(i) the packaged display vectors are passaged in a fresh 
Escherichia coli host and phage-display or 
phagemid-display libraries are formed and, if wanted, 

(j) the passaged display vectors are 

- either in the case of a phage-display vector 
spontaneously packaged in M13 or M13-like phage coats 

- or in the case of a phagemid-display vector packaged by 
infecting the fresh Escherichia coli host with an M13-type 
helper phage (superinfection) and 

phage-display or phagemid-display libraries are formed. 

22. Method according to claim 21, characterized in that in 
steps (a) to (b) a type IIS restriction enzyme is selected, 
preferably BglI, DraIII, BsgI or BpmI. 



23. Method according to claim 21 or 22, characterized in that 
for cuts (B) and (A) the same restriction site and/or 
restriction enzyme is selected. 

24. Method according to claim 21 or 22, characterized in that 
as cut (B) enzyme and as cut (A) enzyme different enzymes 
are used (Fig. 3), preferably BpmI or BsgI as cut (B) 
enzyme and DraIII as cut (A) enzyme (fd or M13 replication 
origin is cut). 

25. Method according to any of claims 21 to 24, characterized 
in that in step (h) and facultatively in step (j) M13KO7 
is used as M13-type helper phage. 

26. Method according to any of claims 21 to 25, characterized 
in that the phagemid and the cosmid are identical and 
presence and cleavage with cut (A) enzyme is optional 
and/or cut (B) enzyme and cut (A) enzyme are identical. 

27. Method according to any of claims 21 to 26, characterized 
in that in step (i) the multiplicity of infection (MOI) is 
less than or equal to 1. 
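
One way to see the effect of keeping the MOI at or below 1 is to model infection events per cell with a Poisson distribution. This is a common simplifying assumption and is not stated in the claim. The sketch estimates what fraction of infected cells receive more than one particle, which would mix different display vectors in one host; the function name is illustrative.

```python
import math


def multiply_infected_fraction(moi: float) -> float:
    """Fraction of infected cells hit by more than one particle (Poisson model)."""
    p0 = math.exp(-moi)          # cells receiving no particle
    p1 = moi * math.exp(-moi)    # cells receiving exactly one particle
    return (1.0 - p0 - p1) / (1.0 - p0)


for moi in (0.1, 0.5, 1.0):
    print(moi, round(multiply_infected_fraction(moi), 3))
```

Under this model, about 42% of infected cells carry more than one particle at MOI = 1, falling to roughly 5% at MOI = 0.1.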

28. Method according to any of claims 21 to 27, wherein the 
cosmid comprises an fd or M13 bacteriophage origin 
(replication/packaging). 

29. Method according to any of claims 21 to 28, wherein in step 
(e) according to claim 21 a molar ratio of display vectors 
to the cosmid vector within the range of from 3:1 to 15:1 
and preferably 3:1 to 10:1 is used. 



30. Method according to any of claims 21 to 29, wherein in step 
(e) according to claim 21 a vector concentration 
(comprising display vectors and cosmid vectors) of more 
than 100 µg DNA/ml is used. 
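
Claims 29 and 30 give a molar ratio and a total DNA concentration, whereas a ligation is usually set up by mass. The sketch below converts between the two under the assumption that DNA mass scales with the number of molecules times their length in base pairs. The vector sizes in the example are hypothetical placeholders, not values taken from the claims.

```python
def ligation_masses(total_ug: float, molar_ratio: float,
                    display_bp: int, cosmid_bp: int) -> tuple[float, float]:
    """Split a total DNA mass into (display vector mass, cosmid mass).

    molar_ratio is display vector : cosmid (e.g. 3.0 for 3:1).
    Mass is taken as proportional to moles x length in bp.
    """
    display_weight = molar_ratio * display_bp
    cosmid_weight = float(cosmid_bp)
    total_weight = display_weight + cosmid_weight
    return (total_ug * display_weight / total_weight,
            total_ug * cosmid_weight / total_weight)


# Hypothetical sizes: 4600 bp display vector, 5000 bp cosmid, 3:1 ratio,
# 100 µg of DNA in total (per ml this is the lower bound named in claim 30).
display_ug, cosmid_ug = ligation_masses(100.0, 3.0, 4600, 5000)
print(round(display_ug, 1), round(cosmid_ug, 1))  # 73.4 26.6
```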



31. A method for the production of large 

- phage-display extension libraries or 

- phagemid-display extension libraries, wherein 

an oligonucleotide cassette of d bases in length is 
inserted into a restriction site (cut (B)) via the 
cohesive ends ZZ as defined in any of claims 1 to 18 to 
yield a sequence or a gene comprising a double-stranded 
DNA sequence which is represented by the following formula 
of one of its strands: 

5' B1...Bn Xn+1...Xn+a Zn+a+1 Zn+a+2 Xn+a+3...Xn+a+d 
Zn+a+d+1 Zn+a+d+2 Xn+a+d+3...Xn+a+d+b Qn+a+d+b+1...Qn+a+d+b+j 3' 

wherein d is an integer and a multiple of 3, preferably 
within the range of from 6 to 36; n, a, b and j and B, X, 
Z and Q have the same meaning as in any of the preceding 
claims; and wherein 

(a) to (b) a double-stranded DNA prepared from Escherichia 
coli cells containing a display vector population, 
consisting of M13 phages or M13-like phages or consisting 
of phagemids according to any of the preceding claims; a 
cosmid vector; a restriction enzyme for cut (B); and a 
restriction enzyme for cut (A) are selected, wherein 

(i) the cut (B) enzyme cleaves the display vectors in the 
region encoding the displayed peptide or displayed protein 
and generates unique non-symmetrical cohesive ends, 
wherein each cohesive end is a 2 bp single strand end 
formed by the two bases designated Z, 

(ii) the cut (A) enzyme cleaves the display vectors and the 
cosmid vector such that unique non-symmetrical cohesive 
ends are formed which differ from those resulting from cut 
(B), 

(c1) the display vectors are cut with the cut (B) 
restriction enzyme, 

(c2) a DNA cassette is inserted into the cleavage site 
via its ZZ cohesive ends, 

(d) the resulting display vector and the cosmid vector are 
cleaved with the cut (A) restriction enzyme, 

(e) the cleaved display vectors are ligated with the 
cleaved cosmid vectors forming concatemers, 

(f) the ligation product is subjected to a lambda packaging 
and transduced into an Escherichia coli host such that the 
DNA cassette lies between two hypervariable sequences 
(extension sequences), 

(g) if wanted, selection is made for a gene present in the 
ligated display vectors, 

(h) the transduced display vectors in the Escherichia coli 
host are 

- either in the case of a phage-display vector 
spontaneously packaged in M13 or M13-like phage coats 

- or in the case of a phagemid-display vector packaged by 
infecting the Escherichia coli host with an M13-type 
helper phage (superinfection), 

(i) the packaged display vectors are passaged in a fresh 
Escherichia coli host and phage-display or 
phagemid-display libraries are formed, and, if wanted, 

(j) the passaged display vectors are 

- either in the case of a phage-display vector 
spontaneously packaged in M13 or M13-like phage coats 

- or in the case of a phagemid-display vector packaged by 
infecting the fresh Escherichia coli host with M13-type 
helper phages (superinfection) and 

phage-display or phagemid-display extension libraries are 
formed. 
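
Claim 31 requires the length d to be a multiple of 3, which keeps the displayed peptide in frame after the insertion. The small sketch below validates d and reports how many codons an insertion of that length adds. The function names are illustrative, and the 6 to 36 window is the preferred range quoted in the claim.

```python
def added_codons(d: int) -> int:
    """Number of codons added by an in-frame insertion of d bases."""
    if d <= 0 or d % 3 != 0:
        raise ValueError("d must be a positive multiple of 3 to keep the reading frame")
    return d // 3


def in_preferred_range(d: int) -> bool:
    """True if d lies within the preferred range of 6 to 36 bases."""
    return 6 <= d <= 36


for d in (6, 21, 36):
    print(d, added_codons(d), in_preferred_range(d))
```

The preferred range therefore corresponds to insertions of 2 to 12 codons.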

32. A method for the reassortment of the 5'- and/or 3'- 
extensions in the production of large recombinant 

- phage-display extension libraries or 

- phagemid-display extension libraries, 

comprising the sequence defined in claim 31, wherein 
recombination takes place at one or the other, or 
consecutively at both, of the cleavage site(s) ZZ bracketing 
the inserted cassette(s), wherein 

(a) to (b) a double-stranded DNA prepared from Escherichia 
coli cells containing a display vector population, 
consisting of M13 phages or M13-like phages or consisting 
of phagemids as display vectors according to claim 31; a 
cosmid vector; a restriction enzyme for cut (B); and a 
restriction enzyme for cut (A) are selected, wherein 

(i) the cut (B) enzyme cleaves the display vectors in the 
region encoding the displayed peptide or displayed protein 
and generates unique non-symmetrical cohesive ends at 
selectively either 

the 5'-junction of extension and cassette (cleavage 
by the restriction enzyme recognizing the binding site 
B1...Bn in claims 1 and 31), or 

the 3'-junction of extension and cassette 
(cleavage by the restriction enzyme recognizing the 
binding site Qn+a+b+1...Qn+a+b+j in claim 1, or 
Qn+a+d+b+1...Qn+a+d+b+j in claim 31), wherein each 
cohesive end is a 2 bp single strand end formed by the 
two bases designated Z, 

(ii) the cut (A) enzyme cleaves the display vectors and 
the cosmid vector and generates upon cleavage unique 
non-symmetrical cohesive ends which differ from those 
resulting from cut (B), 

(c) the display vectors are cleaved with the first 
restriction enzyme, 

(d) the display vector and the cosmid vector are cleaved 
with the second restriction enzyme, 

(e) the cleaved display vectors are ligated with the 
cleaved cosmid vectors forming concatemers, 

(f) the ligation product is subjected to a lambda 
packaging and transduced into an Escherichia coli host, 

(g) if wanted, selection is made for a gene present in 
the ligated display vectors, 

(h) the transduced display vectors in the Escherichia coli 
host are 

- either in the case of a phage-display vector 
spontaneously packaged in M13 or M13-like phage coats 

- or in the case of phagemid-display vectors packaged by 
infecting the Escherichia coli host with an M13-type 
helper bacteriophage (superinfection), 

(i) the packaged display vectors are passaged in a fresh 
Escherichia coli host and phage-display or 
phagemid-display libraries are formed and, if wanted, 

(j) the passaged display vectors are 

- either in the case of a phage-display vector 
spontaneously packaged in M13 or M13-like phage coats 

- or in the case of a phagemid vector packaged by 
infecting the fresh Escherichia coli host with M13-type 
helper phages (superinfection) and 

phage-display or phagemid-display libraries are formed. 

33. Method according to claim 31 or 32, characterized in 
that in steps (a) to (b) a type IIS restriction enzyme is 
selected, preferably BglI, DraIII, BsgI or BpmI. 

34. Method according to claim 31 or 32, characterized in 
that for cuts (B) and (A) the same restriction site is 
selected. 

35. Method according to claim 31 or 32, characterized in 
that as cut (B) enzyme and as cut (A) enzyme different 
enzymes are used, preferably BsgI or BpmI as cut (B) 
enzyme and DraIII as cut (A) enzyme (fd or M13 replication 
origin is cut). 

36. Method according to any of claims 21 to 35, 
characterized in that in step (h) and facultatively in 
step (j) M13KO7 is used as the M13-type helper phage. 

37. Method according to any of claims 31 to 36, 
characterized in that in step (g) selection is made for 
the presence of an antibiotic resistance gene. 

38. Method according to any of claims 31 to 37, 
characterized in that in step (i) the multiplicity of 
infection (MOI) is less than or equal to 1. 

39. Method according to claims 31 to 38, wherein the 
cosmid comprises an fd or M13 bacteriophage origin. 

40. Method according to claims 31 to 39, wherein in step 
(e) according to claim 31 or 32 a molar ratio of display 
vectors to the cosmid vector within the range 3:1 to 15:1 
and preferably 3:1 to 10:1 is used. 

41. Method according to claims 31 to 40, wherein in step 
(e) according to claim 31 or 32 a vector concentration 
(comprising display vectors and cosmid vectors) of more 
than 100 µg DNA/ml is used. 

42. Method for the de novo production of large 

- phage-display libraries or 

- phagemid-display libraries, 

comprising DNA sequences according to any of the claims 1 
to 18, and subjectable to recombination according to a 
procedure according to any of claims 21 to 41, wherein 
recombination takes place within a DNA sequence according 
to any of the preceding claims, especially claim 21 or 31, 
wherein 

(a) a display vector, consisting of an M13 phage or 
M13-like phage or consisting of a phagemid-display vector 
comprising a bacteriophage replication origin, 
facultatively a gene for a selectable marker, preferably 
for an antibiotic resistance, a lambda bacteriophage 
cos-site and a "stuffer" sequence (Figure 5, upper right), 
containing two binding sites for a type IIS restriction 
enzyme different from any of the enzymes according to the 
previous claims, wherein said two sites are oriented in 
divergent orientation and where the cohesive ends 
generated on cleavage are non-symmetrical and differ from 
one another at the two sites, and 
(b) a PCR-generated fragment comprising part of one of the 
sequences according to any of claims 1 to 31, including a 
(the) hypervariable sequence(s), preferably 
Xn+1...Xn+a Zn+a+1 Zn+a+2 Xn+a+3...Xn+a+b according to claim 1, 
bracketed by the same type IIS restriction enzyme binding 
sites defined in (a), but in this case both oriented 
inwards towards the hypervariable sequence (Figure 5, left 
side) and where on cleavage by this restriction enzyme two 
non-symmetrical, single strand ends different from one 
another are generated, where the first end (a' in Fig. 5) 
is complementary to one of the ends (a in Fig. 5) 
generated on the large vector fragment in (a) and the 
second end (b' in Fig. 5) is complementary to the other 
end (b in Fig. 5) generated on the large vector fragment 
in (a), 

(c) the two cleavage reaction systems (a) and (b) still 
containing the active type IIS restriction enzyme are 
mixed together in approximately equimolar proportions and 
subjected to ligation in the presence of DNA ligase; 
fragments containing the restriction enzyme binding sites 
are constantly removed ("stuffer" fragment and outer end 
of the PCR product) whereas the other two components, 
namely the large vector fragment and the insert sequence 
(central fragment from the PCR reaction) are driven to form 

A) a concatemeric hybrid if the ligation is carried 
out at > 100 µg DNA/ml (Figured), or 

B) a circular hybrid if the ligation is carried out 
at ≤ 40 µg DNA/ml, 
(d1) in the case of protocol A) the DNA is packaged into 
lambda particles and transduced into an Escherichia coli 
host, 

(d2) in the case of protocol B) the DNA is transformed into 
an Escherichia coli host, 

(e) if wanted, selection is made for a gene present in 
the ligated display vectors, 

(f) the transduced display vectors in the Escherichia 
coli host are 

- either in the case of a phage-display vector 
spontaneously packaged in M13 or M13-like phage coats 

- or in the case of phagemid-display vectors packaged by 
infecting the Escherichia coli host with an M13-type 
helper bacteriophage (superinfection), 

(g) the packaged display vectors are passaged in a fresh 
Escherichia coli host and phage-display or 
phagemid-display libraries are formed and, if wanted, 

(h) the passaged display vectors are 

- either in the case of a phage-display vector 
spontaneously packaged in M13 or M13-like phage coats 

- or in the case of a phagemid vector packaged by 
infecting the fresh Escherichia coli host with M13-type 
helper phages (superinfection) and 

phage-display or phagemid-display libraries are formed. 
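
The choice between protocols A and B in claim 42 depends only on the DNA concentration during ligation. The sketch below encodes the two thresholds as stated. The claim does not say what is expected between 40 and 100 µg DNA/ml, so the function returns None there; the function name and return strings are illustrative.

```python
from typing import Optional


def ligation_protocol(conc_ug_per_ml: float) -> Optional[str]:
    """Map a ligation DNA concentration to protocol A or B of claim 42."""
    if conc_ug_per_ml > 100:
        # Concatemeric hybrid: lambda packaging, then transduction.
        return "A"
    if conc_ug_per_ml <= 40:
        # Circular hybrid: direct transformation.
        return "B"
    # Between the two thresholds the claim makes no statement.
    return None


for conc in (20, 40, 70, 150):
    print(conc, ligation_protocol(conc))
```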

43. Method according to claim 42, characterized in that in 
steps (a) to (b), as type IIS restriction enzyme, 
preferably BpiI, BsgI or BpmI is selected. 

44. Method according to claims 42 or 43, characterized in that 
in step (f) and facultatively in step (h) M13KO7 is used 
as the M13-type helper phage. 

45. Method according to any of the claims 42 to 44, 
characterized in that in step (e) selection is made for 
the presence of an antibiotic resistance gene. 

46. Method according to any of the claims 42 to 45, 
characterized in that in step (g) the multiplicity of 
infection (MOI) is less than or equal to 1. 

47. A phage-display library or a phagemid-display library in 
the form of packaged particles obtainable according to any 
of claims 21 to 46. 

48. A phage-display library or a phagemid-display library in 
the form of display vectors comprised by Escherichia coli 
population(s) obtainable according to any of the claims 21 
to 46. 

49. Phage-display libraries or phagemid libraries, 
characterized by genes according to any of claims 1 to 20 
and obtainable according to any of claims 21 to 46, 
wherein the term "large" according to claims 21, 31, 32 
and 42 is defined as in excess of 10^ variant 
clones, preferentially 10^8 to 10^^ variant clones. 



50. Protein or peptide comprising a peptide sequence encoded by 
a DNA sequence according to any of claims 1 to 18 and 
obtainable by affinity selection procedures on a defined 
target by means of a library according to claim 48 or 49. 
Figure 1. 
Figure 3. 
Figure 4. 
1 ggcgagctcc cgtgcagcgc tccaggtacc ccgatatcag agctgaaact gttgaaagtt 
61 gtttagcaaa atcccataca gaaaattcat ttactaacgt ctggaaagac gacaaaactt 
121 tagatcgtta cgctaactat gagggctgtc tgtggaatgc tacaggcgtt gtagtttgta 
181 ctggcgacga aactcagcgt tacggcacat gggctcctat tgggctcgct atccctgaaa 
241 acgagggtgg tggctctgag ggtggcggtt ctgagggtgg cggttctgag ggtggcggta 
301 ctaaacctcc tgagtacggt gatacaccta ttccgggcta tacttatatc aaccctctcg 
361 acggcaccca tccgcctggt accgagcaaa accccgctaa tcctaatccc tctctcgagg 
421 agnctcagcc tcttaatact ttcatgtttc agaataacag gctccgaaac aggcaggggg 
481 catcaactgt ttatacgggc actgttactc aaggcaccga ccccgtcaaa acttactacc 
541 agtacactcc tgtatcatea aaagccatgt atgacgctta ctggaacggc aaattcagag 
601 accgcgcttt ccattctggc tttaatgaag atccattcgt ttgtgaatac caaggccaac 
661 cgtctgacct gcctcaacct cctgtcaatg ctggcggcgg ctccggtggc ggttctggtg 
721 gcggccctga gggcggtggc tctgagggtg gcggttccga gggtggcggc cctgagggag 
781 gcggttccgg tggtggctcn ggttccggtg attttgatta tgaaaagatg gcaaacgcta 
841 ataagggggc tatgaccgaa aacgccgatg aaaacgcgct acagtccgac gctaaaggca 
901 aacctgattc tgtcgctacc gattacggtg ctgctaccga tggtttcatt ggcgacgttt 
961 ccggccttgc taatggtaac ggcgctactg gtgattttgc tggctctaat tcccaaatgg 
1021 cncaagtcgg tgacggtgat aattcacctt taatgaataa tttccgtcaa tatttacctt 
1081 ccctccccca atcggttgaa tgtcgcccct ttgtctttgg cgctggtaaa ccatatgaat 
1141 tttctactga ttgtgacaaa ataaacttat tccgtggtgt ctttgcgctt cttttatacg 
1201 ttgccaccct tatgtatgta ttttctacgt ttgctaacat actgcgcaac aaggagtctt 
1261 aatgactcca gaggtcgaaa ttcacctcga aagcaagctg ataaaccgac acaattaaag 
1321 gccccttttg gagccttttt ttttggagat tttcaacgcg aaaaaatcat tattcgcaat 
1381 cccaagctaa ttcacctcga aagcaagctg ataaaccgat acaattaaag gctccttttg 
1441 gagccttttt ttttggagat tttcaacgtg aaaaaattat tattcgcaat tccaagctct 
1501 gcctcgcgcg tttcggtgat gacggtgaaa acctctgaca catgcagctc ccggagacgg 
1561 tcacagcttg tctgtaagcg gatgcagatc acgcgccctg tagcggcgca ttaagcgcgg 
1621 cgggtgtggt ggttacgcgc agcgtgaccg ctacacttgc cagcgcccta gcgcccgctc 
1681 ctttcgcttt cttcccttcc tttctcgcca cgttcgccag ctttccccgt caagctctaa 
1741 atcgggggct ccctttaggg ttccgattta gtgctttacg gcacctcgac cccaaaaaac 
1801 tcgattaggg tgatggttca cgtagtgggc catcgccctg atagacggtt tttcgccctt 
1861 tgacgttgga gtccacgttc tttaatagtg gactcttgrt ccaaactgga acaacactca 
1921 accctatctc ggtctattct tttgatttat aagggatttt gccgatttcg gcctattggt 
1981 taaaaaatga gctgatttaa caaaaattta acgcgaattt taacaaaata ttaacgttta 
2041 caatttgatc tgcgctcggt cgttcggctg cggcgagcgg tatcagctca ctcaaaggcg 
2101 gtaatacggt tatccacaga atcaggggat aacgcaggaa agaacatgtg agcaaaaggc 
2161 cagcaaaagg ccaggaaccg taaaaaggcc gcgttgctgg cgtttttcca taggctccgc 
2221 ccccctgacg agcatcacaa aaatcgacgc tcaagtcaga ggtggcgaaa cccgacagga 
2281 ctataaagat accaggcgtt tccccctgga agctccctcg tgcgctctcc tgttccgacc 
2341 ctgccgctta ccggatacct gtccgccttt ctcccttcgg gaagcgtggc gctttctcaa 
2401 tgctcacgct gtaggtatct cagttcggtg taggtcgttc gctccaagct gggctgtgtg 
2461 ::acgaacccc ccgttcagcc cgaccgctgc gccttatccg gtas^ctatcg tcttgagtcc 
2521 aacccggtaa gacacgactt atcgccactg gcagcagcca ctggtaacag gattagcaga 
2581 gcgaggtatg taggcggtgc tacagagttc ttgaagtggt ggcctaacta cggctacact 
2641 agaaggacag tatttggtat ctgcgctctg ctgaagccag ttaccttcgg aaaaagagtt 
2701 ggtagctctt gatccggcaa acaaaccacc gctggtagcg gtggtttttt tgtttgcaag 
2761 cagcagatta cgcgcagaaa aaaaggatct caagaagatc ctttgatctt ttctacgggg 
2821 tctgacgctc agtggaacga aaactcacgt taagggattt tggtcatgag attatcaaaa 
2881 aggaccttca cctagatcct tttaaattaa aaatgaagtt ttaaatcaat ctaaagtata 
2941 tatgagtaaa cttggtctga cagttaccaa tgcttaatca gtgaggcacc tatctcagcg 
3001 atctgtctat ttcgttcatc catagttgcc tgactccccg tcgtgtagac aactacgata 
3061 cgggagggct taccatctgg ccccagtgct gcaatgatac cgcgagaccc acgcccaccg 
3121 gctccgcttt tatcagcaat aaaccagcca gccggaaggg ccgagcgcag aagtggtcct 
3181 gcaactttat ccgcctccat ccagtctatt aattgttgcc gggaagctag agtaagtagt 
3241 ccgccagtta atagtttgcg caacgttgtt gccattgctg caggcaccgc ggtgtcacgc 
3301 tcgtcgtttg gtatggcttc attcagctcc ggttcccaac gaccaaggcg agttacatga 
3361 tcccccatgt tgtgcaaaaa agcggttagc tccttcggtc ctccgarcgt tgccagaagt 
3421 aagttggccg cagtgttacc actcatggct atggcagcac tgcataaztc tcttactgtc 
3481 atgccatccg taagatgctt ttctgtgacc ggtgagtact caaccaagtc attctgagaa 






3541 tagtgtatgc ggcgaccgag ttgctcttgc ccggcgtcaa cacgggataa taccgcgcca 
3601 cacagcagaa ctttaaaagt gctcatcatt ggaaaacgtt cttcggggcg aaaaccctca 
3661 aggatcttac cgctgttgag atccagttcg atgtaaccca ctcgtgcacc caactgatct 
3721 tcagcatctt ttactttcac cagcgtttct gggtgagcaa aaacaggaag gcaaaatgcc 
3781 gcaaaaaagg gaataagggc gacacggaaa tgttgaatac tcataccctt cctttttcaa 
3841 tattattgaa gcagacagtt ttattgttca tgatgatata tttttatctt gtgcaatgta 
3901 acatcagaga ttttgagaca caacagatct ggccatcatg atggaatgtt cccccggcgg 
3961 tgttatctgg cagcagtgcc gccgatagta tgcaattgat aattactacc atttgcgggt 
4021 cctttccggc gatccgcctt gtcacggggc ggcgacctzcg cgggttttcg ctatttatga 
4081 aaattttccg gtttaaggcg tttccgttct tcttcgtcat aacttaatgt ttttatttaa 
4141 aataccctct gaaaagaaag gaaacgacag gtgctgaaag cgagcttttt ggccacgatg 
4201 cgtccggcgt agaggatctc tcacctacca aacaatgccc ccctgcaaaa aataaattca 
4261 tacaaaaaac atacagataa ccatctgcgg tgataaatta tctctggcgg tgttgacaca 
4321 aataccactg gcggtgatac tgagcacatc agcaggacgc actgaccacc atgaaggtga 
4381 cgctcttaaa attaagccct gaagaagggc agcattcaaa gcagaaggct ttggggtgtg 
4441 tgacacgaaa cgaagcattg gaattctaca acttgcttgg attcctacaa agaagcagca 
4501 attttcagtg tcagaagtcg accaaggagg tctagataac gagggcaaaa aatgaaaaag 
4561 acagctatcg cgattgcagt ggcactggct ggtttcgcta ccgtagcgca ggcc 

1 ggcgagctcc cgtgcagcgg tcttcagcgc ttgccgtctg accgtagcgc tggaagacgc 
61 tccagagggt accccgatat cagagctgaa actgttgaaa gttgtttagc aaaatcccat 
121 acagaaaatt catttactaa cgtctggaaa gacgacaaaa ctttagatcg ttacgctaac 
181 tatgagggct gtctgtggaa tgctacaggc gttgtagctt gtactggtga cgaaactcag 
241 tgctacggta catgggttcc tattgggctt gctatccctg aaaacgaggg tggtggctct 
301 gagggtggcg gttctgaggg tggcggttct gagggtggcg gtactaaacc tcctgagtac 
361 ggtgacacac ctattccggg ctatacttat atcaaccctc tcgacggcac ttatccgcct 
421 ggtactgagc aaaaccccgc taatcctaat ccttctcttg aggagtctca gcctcttaat 
481 acttccatgt ttcagaataa taggtcccga aataggcagg gggcatcaac tgtttatacg 
541 ggcactgtta ctcaaggcac tgaccccgtt aaaacccatt accagcacac tcctgtatca 
601 tcaaaagcca tgtacgacgc ttactggaac ggtaaattca gagaccgcgc tttccattct 
661 ggctttaatg aagatccatt cgtttgtgaa tatcaaggcc aatcgtctga cctgcctcaa 
721 cctcctgtca atgctggcgg cggctctggt ggtggttctg gtggcggctc tgagggtggt 
781 ggctctgagg gtggcggttc tgagggtggc ggctctgagg gaggcggtcc cggtggtggc 
841 tccggttccg gtgattttga ttatgaaaag atggcaaacg ctaataaggg ggctatgacc 
901 gaaaatgccg atgaaaacgc gctacagtct gacgctaaag gcaaacttga tcctgtcgct 
961 actgattacg gtgctgccat cgatggtttc attggtgacg tttccggcct tgctaatggt 
1021 aatggtgcta ctggtgattt tgctggctct aattcccaaa tggcccaagc cggtgacggt 
1081 gataattcac ctttaatgaa taatttccgc caatatttac cttccctccc tcaatcggtt 
1141 gaatgccgcc cttttgtcct tggcgctggc aaaccatatg aattttctac cgattgtgac 
1201 aaaataaact tattccgtgg tgtctttgcg tttcttttat atgttgccac ctttatgtat 
1261 gtattttcta cgtttgctaa catactgcgt aataaggagt ctcaatgact ctagaggccg 
1321 aaattcacct cgaaagcaag ctgacaaacc gatacaatta aaggctcctt ttggagcctt 
1381 tttttttgga gattttcaac gtgaaaaaat tattattcgc aattccaagc taattcacct 
1441 cgaaagcaag ctgataaacc gatacaatta aaggctcctt ttggagcctt tttttttgga 
1501 gattttcaac gtgaaaaaat tattattcgc aattccaagc tctgcctcgc gcgtttcggt 
1561 gatgacggtg aaaacctctg acacatgcag ctcccggaga cggtcacagc ttgtctgtaa 
1621 gcggatgcag atcacgcgcc ctgtagcggc gcattaagcg cggcgggtgt ggtggttacg 
1681 cgcagcgtga ccgctacact tgccagcgcc ctagcgcccg ctcctttcgc tttcttccct 
1741 tcctttctcg ccacgttcgc cagctttccc cgtcaagctc taaatcgggg gctcccttta 
1801 gggttccgat ttagtgcttt acggcacctc gaccccaaaa aacttgatta gggtgatggt 
1861 tcacgtagtg ggccatcgcc ctgatagacg gtttttcgcc ctttgacgtt ggagtccacg 
1921 ttctttaata gtggactctt gttccaaact ggaacaacac tcaaccctat ctcggtctat 
1981 tcttttgatt tataagggat tttgccgatt tcggcctatt ggttaaaaaa tgagctgatt 
2041 taacaaaaat ttaacgcgaa ttttaacaaa atattaacgt ttacaatttg atctgcgctc 
2101 ggtcgttcgg ctgcggcgag cggtatcagc tcactcaaag gcggtaatac ggttatccac 
2161 agaatcaggg gataacgcag gaaagaacat gtgagcaaaa ggccagcaaa aggccaggaa 
2221 ccgtaaaaag gccgcgttgc tggcgttttt ccataggctc cgcccccctg acgagcatca 
2281 caaaaatcga cgctcaagtc agaggtggcg aaacccgaca ggactataaa gataccaggc 
2341 gtttccccct ggaagctccc tcgtgcgctc tcctgttccg accctgccgc ttaccggata 
2401 cctgtccgcc tttctccctt cgggaagcgt ggcgctttct caatgctcac gctgtaggta 
2461 tctcagttcg gtgtaggtcg ttcgctccaa gctgggctgt gtgcacgaac cccccgttca 
2521 gcccgaccgc tgcgccttat ccggtaacta tcgtcttgag tccaacccgg taagacacga 
2581 cttatcgcca ctggcagcag ccactggtaa caggattagc agagcgaggt atgtaggcgg 
2641 tgctacagag ttcttgaagt ggtggcctaa ctacggctac actagaagga cagtatttgg 
2701 tatctgcgct ctgctgaagc cagttacctt cggaaaaaga gttggtagct cttgatccgg 
2761 caaacaaacc accgctggta gcggtggttt ttttgtttgc aagcagcaga ttacgcgcag 
2821 aaaaaaagga tctcaagaag atcctttgat cttttctacg gggtctgacg ctcagtggaa 
2881 cgaaaactca cgttaaggga ttttggtcat gagattatca aaaaggatct tcacctagat 
2941 ccttttaaat taaaaatgaa gttttaaatc aatctaaagt atatatgagt aaacttggtc 
3001 tgacagttac caatgcttaa tcagtgaggc acctatctca gcgatctgtc tatttcgttc 
3061 atccatagtt gcctgactcc ccgtcgtgta gataactacg atacgggagg gcttaccatc 
3121 tggccccagt gctgcaatga taccgcgaga cccacgccca ccggctccgc ttttatcagc 
3181 aataaaccag ccagccggaa gggccgagcg cagaagtggt cctgcaacct tatccgcctc 
3241 catccagcct attaattgtt gccgggaagc tagagtaagt agttcgccag ttaatagttt 
3301 gcgcaacgtt gttgccattg ctgcaggcat cgtggtgtca cgctcgtcgt ttggtatggc 
3361 ttcattcagc tccggttccc aacgatcaag gcgagttaca tgatccccca tgttgtgcaa 
3421 aaaagcggtt agctccttcg gtcctccgat cgttgtcaga agtaagttgg ccgcagtgtt 
3481 atcactcatg gttatggcag cactgcataa ttctcttact gtcatgccac ccgtaagatg 
3541 cttttctgtg actggtgagt actcaaccaa gtcattctga gaatagtgta tgcggcgacc 
3601 gagttgctct tgcccggcgt caacacggga taataccgcg ccacatagca gaacttcaaa 
3661 agtgctcatc attggaaaac gttctccggg gcgaaaactc tcaaggatct taccgctgtt 
3721 gagatccagt tcgatgtaac ccactcgtgc acccaactga tcttcagcac cttttacttt 
3781 caccagcgtt tctgggtgag caaaaacagg aaggcaaaat gccgcaaaaa agggaataag 
3841 ggcgacacgg aaatgttgaa tactcatact cttccttttt caatattact gaagcagaca 
3901 gttttattgt tcatgatgat acatttttat cttgtgcaat gtaacaccag agattttgag 
3961 acacaacaga tctggccatc atgatggaat gtttccccgg tggtgttatc tggcagcagt 
4021 gccgtcgata gtatgcaatt gataattatt atcatttgcg ggtcccttcc ggcgatccgc 
4081 cttgttacgg ggcggcgacc tcgcgggttt tcgctattta tgaaaacttt ccggtttaag 
4141 gcgtttccgt tcttcttcgt cacaacttaa tgtttttatt taaaataccc tctgaaaaga 
4201 aaggaaacga caggtgctga aagcgagctt tttggccacg atgcgtccgg cgtagaggat 
4261 ctctcaccta ccaaacaatg cccccctgca aaaaataaat tcatataaaa aacacacaga 
4321 taaccatctg cggtgataaa ttatctctgg cggtgttgac ataaatacca ctggcggtga 
4381 tactgagcac atcagcagga cgcactgacc accatgaagg tgacgctctt aaaattaagc 
4441 cccgaagaag ggcagcattc aaagcagaag gctttggggt gtgtgatacg aaacgaagca 
4501 ttggaattct acaacttgct tggattccta caaagaagca gcaactttca gtgtcagaag 
4561 tcgaccaagg aggtctagat aacgagggca aaaaatgaaa aagacagcta tcgcgattgc 
4621 agcggcactg gctggtttcg ctaccgtagc gcaggcc 
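
Numbered listings of this kind can be checked for the recognition sites named in claim 17 with a few lines of code. The sketch below joins the ten-base blocks of a listing into one string and reports 1-based positions of a motif on the strand shown. The helper names are hypothetical, and the excerpt is simply the first two lines of the second listing above.

```python
import re


def parse_listing(lines: list[str]) -> str:
    """Join numbered sequence lines ('61 tccagagggt ...') into one uppercase string."""
    blocks = []
    for line in lines:
        parts = line.split()
        # Skip blank lines and anything that does not start with a position number.
        if not parts or not parts[0].isdigit():
            continue
        blocks.append("".join(parts[1:]))
    return "".join(blocks).upper()


def find_sites(seq: str, site: str) -> list[int]:
    """Return 1-based start positions of every occurrence of site in seq."""
    pattern = f"(?={re.escape(site)})"  # lookahead so overlapping hits are found
    return [m.start() + 1 for m in re.finditer(pattern, seq)]


excerpt = [
    "1 ggcgagctcc cgtgcagcgg tcttcagcgc ttgccgtctg accgtagcgc tggaagacgc",
    "61 tccagagggt accccgatat cagagctgaa actgttgaaa gttgtttagc aaaatcccat",
]
seq = parse_listing(excerpt)
print(find_sites(seq, "GTGCAG"))  # BsgI site: [12]
print(find_sites(seq, "CTCCAG"))  # inverted BpmI site: [60]
```

Only the strand as printed is searched, so a site on the opposite strand has to be queried through its reverse complement, as is done here for BpmI.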



